INTRODUCTION



The RISC-based Am29000 Streamlined Instruction Processor from Advanced Micro Devices is the
high-performance solution for your general-purpose embedded systems needs. As the heart of the 29K Family, this
32-bit CMOS microprocessor delivers outstanding performance, yet offers flexible cost-effective solutions that can
quickly move your product to market.

This data book is your comprehensive guide to AMD's 29K Family of microprocessors and development tools. 
These products have helped current developers create applications that fully exploit the power of the Am29000 
microprocessor: laser printers of all types, real-time graphics systems, networks and bridges, and a host of other 
peripheral and communication devices. 

To provide a total system solution for you, AMD has taken the 29K Family's advantages of 17-MIPS performance,
flexible memory-configuration requirements, and outstanding development tools and coupled them with our
Fusion29K™ program. This program provides you with AMD and industry-standard third-party solutions, including
the application-specific solutions you need for successful system integration that can substantially shorten the
time-to-market factor of your design.

AMD is committed to the 29K Family, and will continue to apply substantial resources to ensure that the present 
levels of high performance, cost and design flexibility, and rapid design cycles are maintained and further 
enhanced. Qualified support is readily available for our customers — our highly trained field applications engineers 
are backed by experts in the factory. For further details on how the 29K Family can be the solution to your design 
needs, call your local AMD sales office or the authorized representative listed in the back of this publication. 




Geoff Tate 

Senior Vice President 

Microprocessors & Peripherals Group 
Advanced Micro Devices' 29K™ Family is a new generation of high-performance CMOS microprocessor
components and associated software tools. The heart of the 29K Family is the RISC-based Am29000™ microprocessor.
The Am29000 Streamlined Instruction Processor is a high-performance, general-purpose, 32-bit microprocessor
that supports a variety of applications, by virtue of a flexible architecture and rapid execution of simple
instructions which are common to a wide range of tasks. The 29K Family's microprocessors are fully described in
Chapter 1.

The Am29000 Streamlined Instruction Processor efficiently performs operations common to all systems, while 
deferring most decisions on system policies to the system architect. It is well suited for applications in high- 
performance workstations, general-purpose super minicomputers, high-performance real-time controllers, laser 
printer controllers, network protocol converters, and many other applications where high performance, flexibility,
and the ability to program using standard software tools are important.

The Am29000 microprocessor has been enhanced to support byte and half-word loads and stores. This feature is 
provided as an option, requiring that an external device or memory be able to write individual bytes and/or half- 
words of a word. The Am29000 microprocessor can perform all necessary padding, sign extension, and alignment 
within the word. Furthermore, this feature is defined to be compatible with existing 29K Family software. 
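
As a rough illustration of what padding, sign extension, and alignment within the word mean, the sketch below extracts a byte or half-word from a 32-bit word and right-aligns it. The function name, its parameters, and the default byte order (byte 0 taken as the most significant byte) are assumptions made for this example; they are not taken from this data book.

```python
def extract_field(word, byte_offset, size, signed=False, big_endian=True):
    """Right-align a byte (size=1) or half-word (size=2) taken from a 32-bit word.

    Illustrative only: byte numbering within the word is an assumption.
    """
    if size not in (1, 2):
        raise ValueError("size must be 1 (byte) or 2 (half-word)")
    if byte_offset % size or not 0 <= byte_offset <= 4 - size:
        raise ValueError("field must be naturally aligned within the word")
    bits = 8 * size
    # Alignment: find how far the field sits above bit 0.
    if big_endian:
        shift = 32 - bits - 8 * byte_offset
    else:
        shift = 8 * byte_offset
    value = (word >> shift) & ((1 << bits) - 1)
    # Padding is implicit (upper bits are zero); sign extension fills them with the sign bit.
    if signed and value & (1 << (bits - 1)):
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)
    return value


print(hex(extract_field(0x1122F344, 2, 1)))               # zero-padded byte
print(hex(extract_field(0x1122F344, 2, 1, signed=True)))  # sign-extended byte
```

A store would perform the inverse step, which is why the external device or memory must be able to write individual bytes or half-words of a word.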

The Am29027™ Arithmetic Accelerator is a high-performance computational unit intended for use with the Am29000
Streamlined Instruction Processor. It connects directly to the Am29000 microprocessor's system buses, and requires no
additional interface circuitry. When added to an Am29000 microprocessor-based system, the Am29027
coprocessor can improve floating-point performance by an order of magnitude or more. The Am29027 coprocessor
implements an extensive floating-point and integer instruction set, and can perform operations on single-, double-,
or mixed-precision operands.

But the superior performance of the 29K Family of microprocessors is only part of the story: AMD also provides a 
comprehensive set of software and hardware development tools, as shown in Chapter 2. These tools, coupled 
with the growing number of development products from established third-party vendors, can drastically reduce the 
time-to-market factor of designs. 

For software development, AMD offers the globally optimizing HighC29K™ Cross-Development Toolkit, complete 
with high-performance math libraries. The HighC29K compiler is packaged with the ASM29K™ Cross-Develop- 
ment Toolkit, which includes a relocatable macro assembler, linker/loader, librarian, and a full architectural 
simulator of the Am29000 microprocessor. 

Several debugging tools are available, including XRAY29K™, a source-level debugger for high-level and
assembly-level debugging, and the software-based MON29K™ target-resident debugger/monitor. All tools work at
the Am29000 processor's clock rate to allow debugging while operating at full microprocessor speed.

The application notes in Chapter 3 make development with the 29K Family of silicon and tools a simpler task. 
Within these documents, AMD engineers explore solutions of common problems that stand as roadblocks in your 
development path. So whether you need general information on programming standalone Am29000 microproces- 
sor-based systems or detailed specifics on how to make your product HIF compatible, these application notes can 
provide the answers. And with new notes constantly being written and released, this wealth of knowledge will 
continue to be integral to your development process. 
DISTINCTIVE CHARACTERISTICS

■ Full 32-bit, three-bus architecture
■ 23 million instructions per second (MIPS) sustained at 33 MHz
■ 33-, 25-, 20-, and 16-MHz operating frequency
■ Efficient execution of high-level language programs
■ CMOS technology
■ 4-gigabyte virtual address space with demand paging
■ Concurrent instruction and data accesses
■ Burst-mode access support
■ 192 general-purpose registers
■ 512-byte Branch Target Cache™
■ 64-entry Memory-Management Unit
■ Demultiplexed, pipelined address, instruction, and data buses
■ Three-address instruction architecture
■ On-chip byte-alignment support allows optional byte/half-word accesses
GENERAL DESCRIPTION

The Am29000™ Streamlined Instruction Processor is a
high-performance, general-purpose, 32-bit microprocessor
implemented in CMOS technology. It supports a
variety of applications by virtue of a flexible architecture
and rapid execution of simple instructions that are
common to a wide range of tasks.

The Am29000 efficiently performs operations common 
to all systems, while deferring most decisions on system 
policies to the system architect. It is well-suited for ap- 
plication in high-performance workstations, general- 
purpose super-minicomputers, high-performance real- 
time controllers, laser printer controllers, network 
protocol converters, and many other applications where
high performance, flexibility, and the ability to program
using standard software tools are important.






The Am29000 instruction set has been influenced by the
results of high-level language, optimizing compiler
research. It is appropriate for a variety of languages
because it efficiently executes operations that are
common to all languages. Consequently, the Am29000 is an
ideal target for high-level languages such as C,
FORTRAN, Pascal, Ada, and COBOL.

The processor is available in two packaging options: a
169-lead pin-grid-array (PGA) package, and a 164-lead
Ceramic Quad Flat Pack (CQFP) package for the
military. The PGA has 141 signal pins, 27 power and ground
pins, and 1 alignment pin. The CQFP has 141 signal
pins and 23 power and ground pins. A representative
system diagram is shown on page 1.
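
The pin counts quoted above can be cross-checked against the lead counts of the two packages. The short sketch below does only that arithmetic; the dictionary layout and names are illustrative choices, not part of the data book.

```python
# Pin counts as stated in the General Description.
PACKAGES = {
    "PGA":  {"leads": 169, "signal": 141, "power_ground": 27, "alignment": 1},
    "CQFP": {"leads": 164, "signal": 141, "power_ground": 23, "alignment": 0},
}


def lead_total(package):
    """Sum the signal, power/ground, and alignment pins of one package."""
    return package["signal"] + package["power_ground"] + package["alignment"]


for name, package in PACKAGES.items():
    status = "ok" if lead_total(package) == package["leads"] else "mismatch"
    print(name, lead_total(package), status)
```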



29K™ Family Development Support Products

Contact your local AMD representative for information
on the complete set of development support tools.

Software development products on several hosts:

■ Optimizing compilers for common high-level
  languages

■ Assembler and utility packages

■ Source- and assembly-level software
  debuggers

■ Target-resident development monitors

■ Simulators

Hardware Development:

■ ADAPT29K™ Advanced Development and
  Prototyping Tool



RELATED AMD PRODUCTS

Am29000 Peripheral Devices

Part No.      Description
Am29027™      Arithmetic Accelerator
CONNECTION DIAGRAM
169-Lead PGA*

Bottom View

CONNECTION DIAGRAM
164-Lead CQFP

Top View
(Lid Facing Viewer)
PGA PIN DESIGNATION
(Sorted by Pin No.)

Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name
A-1      GND        C-10     GND        J-16     A16        R-12     STAT2
A-2      I1         C-11     GND        J-17     A14        R-13     GND
A-3      I0         C-12     D22        K-1      I26        R-14     OPT0
A-4      D2         C-13     D26        K-2      I25        R-15     A2
A-5      D4         C-14     Vcc        K-3      GND        R-16     A6
A-6      D6         C-15     D30        K-15     Vcc        R-17     A7
A-7      D9         C-16     D31        K-16     A12        T-1      INCLK
A-8      D11        C-17     A29        K-17     A13        T-2      BREQ
A-9      D12        D-1      I11        L-1      I27        T-3      DERR
A-10     D14        D-2      I10        L-2      I28        T-4      IRDY
A-11     D16        D-3      I7         L-3      Vcc        T-5      WARN
A-12     D18        D-4      PIN169     L-15     Vcc        T-6      INTR2
A-13     D20        D-15     A31        L-16     A10        T-7      INTR0
A-14     D21        D-16     A28        L-17     A11        T-8      BINV
A-15     D25        D-17     A26        M-1      I29        T-9      BGRT
A-16     D27        E-1      I13        M-2      I30        T-10     DREQ
A-17     GND        E-2      I12        M-3      GND        T-11     LOCK
B-1      I6         E-3      Vcc        M-15     GND        T-12     MSERR
B-2      I5         E-15     GND        M-16     A0         T-13     STAT0
B-3      I3         E-16     A27        M-17     A1         T-14     SUP/US
B-4      D0         E-17     A23        N-1      I31        T-15     OPT1
B-5      D1         F-1      I16        N-2      TEST       T-16     A3
B-6      D5         F-2      I15        N-3      SYSCLK     T-17     A4
B-7      D8         F-3      I14        N-15     GND        U-1      GND
B-8      D10        F-15     A25        N-16     MPGM1      U-2      PEN
B-9      D13        F-16     A24        N-17     MPGM0      U-3      IERR
B-10     D15        F-17     A21        P-1      CNTL1      U-4      IBACK
B-11     D17        G-1      I19        P-2      CNTL0      U-5      INTR3
B-12     D19        G-2      I18        P-3      PWRCLK     U-6      INTR1
B-13     D23        G-3      I17        P-15     A5         U-7      TRAP0
B-14     D24        G-15     A22        P-16     A8         U-8      IBREQ
B-15     D28        G-16     A20        P-17     A9         U-9      IREQ
B-16     D29        G-17     A19        R-1      RESET      U-10     PIA
B-17     A30        H-1      I20        R-2      CDA        U-11     R/W
C-1      I9         H-2      I22        R-3      DRDY       U-12     DREQT1
C-2      I8         H-3      I21        R-4      DBACK      U-13     DREQT0
C-3      I4         H-15     GND        R-5      GND        U-14     STAT1
C-4      I2         H-16     A18        R-6      Vcc        U-15     IREQT
C-5      GND        H-17     A17        R-7      TRAP1      U-16     OPT2
C-6      D3         J-1      I23        R-8      GND        U-17     GND
C-7      D7         J-2      I24        R-9      DBREQ
C-8      Vcc        J-3      GND        R-10     PDA
C-9      Vcc        J-15     A15        R-11     Vcc

Note: Pin Number D-4 is the alignment pin and is electrically connected to the package lid.
PGA PIN DESIGNATIONS
(Sorted by Pin Name)

Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name
M-16     A0         B-6      D5         K-3      GND        T-1      INCLK
M-17     A1         A-6      D6         N-15     GND        T-7      INTR0
R-15     A2         C-7      D7         R-5      GND        U-6      INTR1
T-16     A3         B-7      D8         U-1      GND        T-6      INTR2
T-17     A4         A-7      D9         R-13     GND        U-5      INTR3
P-15     A5         B-8      D10        R-8      GND        T-4      IRDY
R-16     A6         A-8      D11        M-3      GND        U-9      IREQ
R-17     A7         A-9      D12        U-17     GND        U-15     IREQT
P-16     A8         B-9      D13        A-3      I0         T-11     LOCK
P-17     A9         A-10     D14        A-2      I1         N-17     MPGM0
L-16     A10        B-10     D15        C-4      I2         N-16     MPGM1
L-17     A11        A-11     D16        B-3      I3         T-12     MSERR
K-16     A12        B-11     D17        C-3      I4         R-14     OPT0
K-17     A13        A-12     D18        B-2      I5         T-15     OPT1
J-17     A14        B-12     D19        B-1      I6         U-16     OPT2
J-15     A15        A-13     D20        D-3      I7         R-10     PDA
J-16     A16        A-14     D21        C-2      I8         U-2      PEN
H-17     A17        C-12     D22        C-1      I9         U-10     PIA
H-16     A18        B-13     D23        D-2      I10        D-4      PIN169
G-17     A19        B-14     D24        D-1      I11        P-3      PWRCLK
G-16     A20        A-15     D25        E-2      I12        U-11     R/W
F-17     A21        C-13     D26        E-1      I13        R-1      RESET
G-15     A22        A-16     D27        F-3      I14        T-13     STAT0
E-17     A23        B-15     D28        F-2      I15        U-14     STAT1
F-16     A24        B-16     D29        F-1      I16        R-12     STAT2
F-15     A25        C-15     D30        G-3      I17        T-14     SUP/US
D-17     A26        C-16     D31        G-2      I18        N-3      SYSCLK
E-16     A27        R-4      DBACK      G-1      I19        N-2      TEST
D-16     A28        R-9      DBREQ      H-1      I20        U-7      TRAP0
C-17     A29        T-3      DERR       H-3      I21        R-7      TRAP1
B-17     A30        R-3      DRDY       H-2      I22        C-14     Vcc
D-15     A31        T-10     DREQ       J-1      I23        L-15     Vcc
T-9      BGRT       U-13     DREQT0     J-2      I24        C-8      Vcc
T-8      BINV       U-12     DREQT1     K-2      I25        C-9      Vcc
T-2      BREQ       E-15     GND        K-1      I26        E-3      Vcc
R-2      CDA        H-15     GND        L-1      I27        K-15     Vcc
P-2      CNTL0      M-15     GND        L-2      I28        L-3      Vcc
P-1      CNTL1      C-10     GND        M-1      I29        R-6      Vcc
B-4      D0         A-1      GND        M-2      I30        R-11     Vcc
B-5      D1         A-17     GND        N-1      I31        T-5      WARN
A-4      D2         C-5      GND        U-4      IBACK
C-6      D3         C-11     GND        U-8      IBREQ
A-5      D4         J-3      GND        U-3      IERR

Note: Pin Number D-4 is the alignment pin and is electrically connected to the package lid.
CQFP PIN DESIGNATION
(Sorted by Pin No.)

Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name
1        CDA        42       Vcc        83       Vcc        124      GND
2        INCLK      43       I3         84       GND        125      OPT0
3        PWRCLK     44       I2         85       A31        126      OPT1
4        SYSCLK     45       I1         86       A30        127      OPT2
5        GND        46       GND        87       A29        128      SUP/US
6        Vcc        47       I0         88       A28        129      IREQT
7        GND        48       D0         89       A27        130      STAT0
8        RESET      49       D1         90       A26        131      STAT1
9        CNTL0      50       D2         91       A25        132      STAT2
10       CNTL1      51       D3         92       A24        133      MSERR
11       TEST       52       D4         93       A23        134      DREQT0
12       I31        53       D5         94       A22        135      DREQT1
13       I30        54       D6         95       A21        136      LOCK
14       I29        55       D7         96       A20        137      R/W
15       I28        56       D8         97       A19        138      DREQ
16       I27        57       D9         98       A18        139      PDA
17       I26        58       D10        99       A17        140      PIA
18       I25        59       D11        100      A16        141      IREQ
19       I24        60       D12        101      A15        142      BGRT
20       GND        61       D13        102      GND        143      DBREQ
21       Vcc        62       D14        103      Vcc        144      IBREQ
22       I23        63       Vcc        104      A14        145      BINV
23       I22        64       GND        105      A13        146      Vcc
24       I21        65       D15        106      A12        147      GND
25       I20        66       D16        107      A11        148      Vcc
26       I19        67       D17        108      A10        149      GND
27       I18        68       D18        109      A1         150      TRAP0
28       I17        69       D19        110      A0         151      TRAP1
29       I16        70       D20        111      MPGM0      152      INTR0
30       I15        71       D21        112      MPGM1      153      INTR1
31       I14        72       D22        113      Vcc        154      INTR2
32       I13        73       D23        114      A9         155      INTR3
33       I12        74       D24        115      A8         156      WARN
34       I11        75       D25        116      A7         157      IBACK
35       I10        76       D26        117      A6         158      IRDY
36       I9         77       D27        118      A5         159      IERR
37       I8         78       D28        119      A4         160      DERR
38       I7         79       D29        120      A3         161      DBACK
39       I6         80       D30        121      A2         162      PEN
40       I5         81       D31        122      GND        163      BREQ
41       I4         82       GND        123      GND        164      DRDY
CQFP PIN DESIGNATIONS
(Sorted by Pin Name)

Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name   Pin No.  Pin Name
110      A0         51       D3         82       GND        144      IBREQ
109      A1         52       D4         84       GND        159      IERR
121      A2         53       D5         102      GND        2        INCLK
120      A3         54       D6         122      GND        152      INTR0
119      A4         55       D7         123      GND        153      INTR1
118      A5         56       D8         124      GND        154      INTR2
117      A6         57       D9         147      GND        155      INTR3
116      A7         58       D10        149      GND        158      IRDY
115      A8         59       D11        47       I0         141      IREQ
114      A9         60       D12        45       I1         129      IREQT
108      A10        61       D13        44       I2         136      LOCK
107      A11        62       D14        43       I3         111      MPGM0
106      A12        65       D15        41       I4         112      MPGM1
105      A13        66       D16        40       I5         133      MSERR
104      A14        67       D17        39       I6         125      OPT0
101      A15        68       D18        38       I7         126      OPT1
100      A16        69       D19        37       I8         127      OPT2
99       A17        70       D20        36       I9         139      PDA
98       A18        71       D21        35       I10        162      PEN
97       A19        72       D22        34       I11        140      PIA
96       A20        73       D23        33       I12        3        PWRCLK
95       A21        74       D24        32       I13        137      R/W
94       A22        75       D25        31       I14        8        RESET
93       A23        76       D26        30       I15        130      STAT0
92       A24        77       D27        29       I16        131      STAT1
91       A25        78       D28        28       I17        132      STAT2
90       A26        79       D29        27       I18        128      SUP/US
89       A27        80       D30        26       I19        4        SYSCLK
88       A28        81       D31        25       I20        11       TEST
87       A29        161      DBACK      24       I21        150      TRAP0
86       A30        143      DBREQ      23       I22        151      TRAP1
85       A31        160      DERR       22       I23        6        Vcc
142      BGRT       164      DRDY       19       I24        21       Vcc
145      BINV       138      DREQ       18       I25        42       Vcc
163      BREQ       134      DREQT0     17       I26        63       Vcc
1        CDA        135      DREQT1     16       I27        83       Vcc
9        CNTL0      5        GND        15       I28        103      Vcc
10       CNTL1      7        GND        14       I29        113      Vcc
48       D0         20       GND        13       I30        146      Vcc
49       D1         46       GND        12       I31        148      Vcc
50       D2         64       GND        157      IBACK      156      WARN
LOGIC SYMBOL

[Figure: Am29000 logic symbol, showing the processor's input, output, and bidirectional
signals, including the 32-bit A31-A0, D31-D0, and I31-I0 buses.]
ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The ordering number
(Valid Combination) is formed by a combination of:

    a. Device Number
    b. Speed Option (if applicable)
    c. Package Type
    d. Temperature Range
    e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION

   Am29000
   Streamlined Instruction Processor

b. SPEED OPTION

   -33 = 33 MHz
   -25 = 25 MHz
   -20 = 20 MHz
   -16 = 16 MHz

c. PACKAGE TYPE

   G = 169-Lead Pin Grid Array without
       Heat Sink (CGX169)

d. TEMPERATURE RANGE

   C = Commercial (TC = 0 to +85°C)

e. OPTIONAL PROCESSING

   Blank = Standard Processing
   B = Burn-in

Valid Combinations

   AM29000-33
   AM29000-25      GC, GCB
   AM29000-20
   AM29000-16

Valid Combinations

Valid Combinations list configurations planned to
be supported in volume for this device. Consult
the local AMD sales office to confirm availability of
specific valid combinations, to check on newly
released combinations, and to obtain additional
data on AMD's standard military grade products.






29K Family CMOS Devices 



ORDERING INFORMATION
APL Products

AMD products for Aerospace and Defense applications are available in several packages and operating
ranges. APL (Approved Products List) products are fully compliant with MIL-STD-883C requirements. The
ordering number (Valid Combination) is formed by a combination of:

    a. Device Number
    b. Speed Option (if applicable)
    c. Device Class
    d. Package Type
    e. Lead Finish

a. DEVICE NUMBER/DESCRIPTION

   Am29000
   Streamlined Instruction Processor

b. SPEED OPTION

   -20 = 20 MHz
   -16 = 16 MHz

c. DEVICE CLASS

   /B = Class B

d. PACKAGE TYPE

   Z = 169-Lead Pin Grid Array without Heatsink
       (CGX169)
   Y = 164-Lead Ceramic Quad Flat Pack without
       Heatsink

e. LEAD FINISH

   C = Gold

Valid Combinations

   AM29000-20      /BZC
   AM29000-16

   AM29000-20      /BYC
   AM29000-16

Valid Combinations

Valid Combinations list configurations planned to
be supported in volume for this device. Consult
the local AMD sales office to confirm availability of
specific valid combinations, to check on newly
released combinations, and to obtain additional
data on AMD's standard military grade products.

Group A Tests

Group A tests consist of Subgroups
1, 2, 3, 7, 8, 9, 10, 11.
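
To illustrate how the ordering-number fields combine, the following sketch checks a candidate part number against the valid combinations listed for standard and APL products. It assumes the fields are written back to back with no separators (for example AM29000-25GCB or AM29000-20/BZC); that concatenated form, and all function and variable names, are assumptions of this example rather than something the tables above state.

```python
import re

# Valid combinations as listed in the Standard and APL ordering tables.
STANDARD = re.compile(r"AM29000-(33|25|20|16)(GCB|GC)")
APL = re.compile(r"AM29000-(20|16)/B([ZY])C")

PACKAGES = {"G": "169-lead PGA", "Z": "169-lead PGA", "Y": "164-lead CQFP"}


def parse_ordering_number(text):
    """Return the decoded fields of a valid combination, or None if invalid."""
    candidate = text.strip().upper()
    match = STANDARD.fullmatch(candidate)
    if match:
        return {
            "family": "standard",
            "mhz": int(match.group(1)),
            "package": PACKAGES["G"],
            "temperature": "commercial",
            "burn_in": match.group(2).endswith("B"),
        }
    match = APL.fullmatch(candidate)
    if match:
        return {
            "family": "APL",
            "mhz": int(match.group(1)),
            "device_class": "B",
            "package": PACKAGES[match.group(2)],
            "lead_finish": "gold",
        }
    return None


print(parse_ordering_number("AM29000-25GCB"))
print(parse_ordering_number("AM29000-20/BYC"))
print(parse_ordering_number("AM29000-33/BZC"))  # not a listed APL speed
```

Availability of any combination should still be confirmed with the local AMD sales office, as noted above.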
PIN DESCRIPTION

Although certain outputs are described as being three-
state or bidirectional outputs, all outputs (except
MSERR) may be placed in a high-impedance state by
the Test mode. The three-state and bidirectional
terminology in this section is for those outputs (except
SYSCLK) that are disabled when the processor grants
the channel to another master.

A31-A0 

Address Bus (three-state output, synchronous) 

The Address Bus transfers the byte address for all ac- 
cesses except burst-mode accesses. For burst-mode 
accesses, it transfers the address for the first access in 
the sequence. 



BGRT 

Bus Grant (output, synchronous) 

This output signals to an external master that the
processor is relinquishing control of the channel in
response to BREQ.



BINV 

Bus Invalid (output, synchronous) 

This output indicates that the address bus and related 
controls are invalid. It defines an idle cycle for the 
channel. 



BREQ 

Bus Request (input, synchronous) 

This input allows other masters to arbitrate for control of 
the processor channel. 



CDA 



Coprocessor Data Accept (input, synchronous) 

This signal allows the coprocessor to indicate the
acceptance of operands or operation codes. For transfers
to the coprocessor, the processor does not expect a
DRDY response; an active level on CDA performs the
function normally performed by DRDY. CDA may be
active whenever the coprocessor is able to accept
transfers.

CNTL1-CNTL0

CPU Control (input, asynchronous)

These inputs control the processor mode:

CNTL1   CNTL0   Mode
  0       0     Load Test Instruction
  0       1     Step
  1       0     Halt
  1       1     Normal

D31-D0

Data Bus (bidirectional, synchronous)

The Data Bus transfers data to and from the processor
for load and store operations.

DBACK

Data Burst Acknowledge (input, synchronous)

This input is active whenever a burst-mode data access
has been established. It may be active even though no
data are currently being accessed.

DBREQ

Data Burst Request (three-state output,
synchronous)

This signal is used to establish a burst-mode data
access and to request data transfers during a burst-mode
data access. DBREQ may be active even though the
address bus is being used for an instruction access. This
signal becomes valid late in the cycle, with respect to
DREQ.

DERR

Data Error (input, synchronous)

This input indicates that an error occurred during the
current data access. For a load, the processor ignores
the content of the data bus. For a store, the access is
terminated. In either case, a Data Access Exception trap
occurs. The processor ignores this signal if there is no
pending data access.

DRDY

Data Ready (input, synchronous)

For loads, this input indicates that valid data is on the
data bus. For stores, it indicates that the access is
complete, and that data need no longer be driven on the data
bus. The processor ignores this signal if there is no
pending data access.

DREQ

Data Request (three-state output, synchronous)

This signal requests a data access. When it is active, the
address for the access appears on the address bus.

DREQT1-DREQT0

Data Request Type
(three-state output, synchronous)

These signals specify the address space of a data
access, as follows (the value "x" is a "don't care"):

DREQT1  DREQT0  Meaning
  0       0     Instruction/data memory access
  0       1     Input/output access
  1       x     Coprocessor transfer

An interrupt/trap vector request is indicated as a data-
memory read. If required, the system can identify
the vector fetch by the STAT2-STAT0 outputs.
DREQT1-DREQT0 are valid only when DREQ is active.
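
As a small sketch of how external logic might interpret the data request type, the function below maps DREQT1-DREQT0 to an address space. It assumes the encoding 00 = instruction/data memory, 01 = input/output, 1x = coprocessor transfer, and it treats the signals as already-sampled logical values (1 = active); the function name and that convention are illustrative assumptions.

```python
def decode_dreqt(dreqt1, dreqt0, dreq_active=True):
    """Return the address space selected by DREQT1-DREQT0, or None if not valid.

    The request type is only meaningful while DREQ is active.
    """
    if not dreq_active:
        return None
    if dreqt1:
        # DREQT0 is a "don't care" for coprocessor transfers.
        return "coprocessor transfer"
    if dreqt0:
        return "input/output access"
    return "instruction/data memory access"


for bits in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(bits, decode_dreqt(*bits))
```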
I31-I0

Instruction Bus (input, synchronous)

The Instruction Bus transfers instructions to the
processor.

IBACK

Instruction Burst Acknowledge
(input, synchronous)

This input is active whenever a burst-mode instruction
access has been established. It may be active even
though no instructions are currently being accessed.

IBREQ

Instruction Burst Request
(three-state output, synchronous)

This signal is used to establish a burst-mode instruction
access and to request instruction transfers during a
burst-mode instruction access. IBREQ may be active
even though the address bus is being used for a data
access. This signal becomes valid late in the cycle with
respect to IREQ.

IERR

Instruction Error (input, synchronous)

This input indicates that an error occurred during the
current instruction access. The processor ignores the
content of the instruction bus, and an Instruction Access
Exception trap occurs if the processor attempts to
execute the invalid instruction. The processor ignores this
signal if there is no pending instruction access.

INCLK

Input Clock (input)

When the processor generates the clock for the system,
this is an oscillator input to the processor at twice the
processor's operating frequency. In systems where the
clock is not generated by the processor, this signal must
be tied High or Low, except in certain master/slave
configurations.



INTR3-INTR0

Interrupt Request (input, asynchronous)

These inputs generate prioritized interrupt requests.
The interrupt caused by INTR0 has the highest priority,
and the interrupt caused by INTR3 has the lowest
priority. The interrupt requests are masked in prioritized
order by the Interrupt Mask field in the Current Processor
Status Register.



IRDY 

Instruction Ready (Input, synchronous) 

This input indicates that a valid instruction is on the in- 
struction bus. The processor ignores this signal if there 
is no pending instruction access. 



IREQ 

Instruction Request 

(three-state output, synchronous) 

This signal requests an instruction access. When it is
active, the address for the access appears on the
address bus.

IREQT

Instruction Request Type
(three-state output, synchronous)

This signal specifies the address space of an instruction
request when IREQ is active:

IREQT   Meaning
  0     Instruction/data memory access
  1     Instruction read-only memory access



LOCK 

Lock (three-state output, synchronous) 

This output allows the implementation of various chan- 
nel and device interlocks. It may be active only for the 
duration of an access, or active for an extended period 
of time under control of the Lock bit in the Current 
Processor Status. 

MPGM1-MPGM0 
MMU Programmable 
(three-state output, synchronous) 

These outputs reflect the value of two PGM bits in the
Translation Look-Aside Buffer entry associated with the
access. If no address translation is performed, these
signals are both Low.

MSERR 

Master/Slave Error (output, synchronous) 

This output shows the result of the comparison of 
processor outputs with the signals provided internally to 
the off-chip drivers. If there is a difference for any en- 
abled driver, this line is asserted. 
OPT2-OPT0

Option Control
(three-state output, synchronous)

These outputs reflect the value of bits 18-16 of the load
or store instruction that begins an access. Bit 18 of the
instruction is reflected on OPT2, bit 17 on OPT1, and bit
16 on OPT0.

The standard definitions of these signals (based on
DREQT) are as follows (the value "x" is a "don't care"):

DREQT1  DREQT0  OPT2  OPT1  OPT0  Meaning
  0       x      0     0     0    Word-length access
  0       x      0     0     1    Byte access
  0       x      0     1     0    Half-word access
  0       0      1     0     0    Instruction ROM access (as data)
  0       0      1     0     1    Cache control
  0       0      1     1     0    ADAPT29K accesses
        -all others-              Reserved

During an interrupt/trap vector fetch, the OPT2-OPT0
signals indicate a word-length access (000). Also, the
system should return an entire aligned word for a read,
regardless of the indicated data length.

The Am29000 does not explicitly prevent a store to the
instruction ROM. OPT2-OPT0 are valid only when
DREQ is active.

PDA

Pipelined Data Access
(three-state output, synchronous)

If DREQ is not active, this output indicates that a data
access is pipelined with another in-progress data access.
The indicated access cannot be completed until the first
access is complete. The completion of the first access is
signaled by the assertion of DREQ.

PEN

Pipeline Enable (input, synchronous)

This signal allows devices that can support pipelined
accesses (i.e., that have input latches for the address and
required controls) to signal that a second access may
begin while the first is being completed.

PIA

Pipelined Instruction Access
(three-state output, synchronous)

If IREQ is not active, this output indicates that an
instruction access is pipelined with another in-progress
instruction access. The indicated access cannot be completed
until the first access is complete. The completion of the
first access is signaled by the assertion of IREQ.

R/W

Read/Write (three-state output, synchronous)

This signal indicates whether data is being transferred
from the processor to the system, or from the system to
the processor. R/W is valid only when the address bus is
valid. R/W will be High when IREQ is active.
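
The standard OPT2-OPT0 definitions given in this section lend themselves to a lookup keyed on the request type and option bits. The sketch below is one hedged way to express that decode; it assumes the word, byte, and half-word encodings apply whenever DREQT1 is 0, that the instruction ROM, cache control, and ADAPT29K encodings apply only when DREQT1-DREQT0 is 00, and that every other combination is reserved. The function name and the use of logical 0/1 values are assumptions of the example.

```python
# Encodings valid for DREQT1 = 0 with DREQT0 as a "don't care".
LENGTH_ACCESSES = {
    (0, 0, 0): "Word-length access",
    (0, 0, 1): "Byte access",
    (0, 1, 0): "Half-word access",
}

# Encodings valid only for DREQT1-DREQT0 = 00.
MEMORY_SPACE_ACCESSES = {
    (1, 0, 0): "Instruction ROM access (as data)",
    (1, 0, 1): "Cache control",
    (1, 1, 0): "ADAPT29K accesses",
}


def decode_opt(dreqt1, dreqt0, opt2, opt1, opt0):
    """Decode the standard meaning of OPT2-OPT0 for a data access."""
    opt = (opt2, opt1, opt0)
    if dreqt1 == 0 and opt in LENGTH_ACCESSES:
        return LENGTH_ACCESSES[opt]
    if dreqt1 == 0 and dreqt0 == 0 and opt in MEMORY_SPACE_ACCESSES:
        return MEMORY_SPACE_ACCESSES[opt]
    return "Reserved"


print(decode_opt(0, 1, 0, 0, 1))  # byte access to the I/O space
print(decode_opt(0, 0, 1, 0, 0))  # instruction ROM read as data
print(decode_opt(0, 1, 1, 0, 0))  # falls under "all others"
```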



RESET 

Reset (input, asynchronous) 

This input places the processor in the Reset mode. 

STAT2-STAT0 

CPU Status (output, synchronous) 

These outputs indicate the state of the processor's exe- 
cution stage on the previous cycle. They are encoded 
as follows: 



STAT2  STAT1  STAT0  Condition
  0      0      0    Halt or Step Modes
  0      0      1    Pipeline Hold Mode
  0      1      0    Load Test Instruction Mode, Halt/Freeze
  0      1      1    Wait Mode
  1      0      0    Interrupt Return
  1      0      1    Taking Interrupt or Trap
  1      1      0    Non-sequential Instruction Fetch
  1      1      1    Executing Mode
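
Because the status outputs report the state of the execution stage on every cycle, external logic or a logic-analyzer trace can use them to estimate how often the processor is doing useful work. The following is a minimal sketch of that idea; the names `decode_stat` and `executing_fraction` are hypothetical, and the sample trace is invented for illustration.

```python
# Conditions indexed by the 3-bit value STAT2 STAT1 STAT0.
STAT_CONDITIONS = [
    "Halt or Step Modes",                       # 000
    "Pipeline Hold Mode",                       # 001
    "Load Test Instruction Mode, Halt/Freeze",  # 010
    "Wait Mode",                                # 011
    "Interrupt Return",                         # 100
    "Taking Interrupt or Trap",                 # 101
    "Non-sequential Instruction Fetch",         # 110
    "Executing Mode",                           # 111
]


def decode_stat(stat2: int, stat1: int, stat0: int) -> str:
    """Return the condition named by the three status outputs."""
    return STAT_CONDITIONS[(stat2 << 2) | (stat1 << 1) | stat0]


def executing_fraction(samples) -> float:
    """Fraction of sampled cycles (3-bit STAT values) spent in Executing Mode."""
    samples = list(samples)
    if not samples:
        return 0.0
    executing = sum(1 for value in samples if value == 0b111)
    return executing / len(samples)
```

For example, a four-cycle trace containing one Wait Mode cycle and three Executing Mode cycles gives an executing fraction of 0.75.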



SUP/US 

Supervisor/User Mode 
(three-state output, synchronous) 

This output indicates the program mode for an access. 

The processor does not relinquish the channel (in response
to BREQ) when LOCK is active.

SYSCLK 

System Clock (bidirectional) 

This is either a clock output with a frequency that is half 
that of INCLK, or an input from an external clock genera- 
tor at the processor's operating frequency. 



TEST 

Test Mode (input, asynchronous) 

When this input is active, the processor is in Test mode. 
All outputs and bidirectional lines, except MSERR, are 
forced to the high-impedance state.



TRAP1-TRAP0

Trap Request (input, asynchronous)

These inputs generate prioritized trap requests. The
trap caused by TRAP0 has the highest priority. These
trap requests are disabled by the DA bit of the Current
Processor Status Register.



WARN

Warn (input, asynchronous, edge-sensitive)

A high-to-low transition on this input causes a
non-maskable WARN trap to occur. This trap bypasses the
normal trap vector fetch sequence, and is useful in
situations where the vector fetch may not work (e.g., when
data memory is faulty).

The following pins are not signal pins, but are named in
Am29000 documentation because of their special role
in the processor and system.

PWRCLK

Power Supply for SYSCLK Driver

This pin is a power supply for the SYSCLK output driver.
It isolates the SYSCLK driver, and is used to determine
whether or not the Am29000 generates the clock for the
system. If power (+5 volts) is applied to this pin, the
Am29000 generates a clock on the SYSCLK output. If
this pin is grounded, the Am29000 accepts a clock
generated by the system on the SYSCLK input.

PIN169

Alignment pin

In the PGA package, this pin is used to indicate proper
pin-alignment of the Am29000 and is used by the
ADAPT29K to communicate its presence to the system.
This pin does not exist on the Am29000 in CQFP
package.



FUNCTIONAL DESCRIPTION

Product Overview 

The Am29000 contains a high-function execution unit, a 
large register file (192 locations), a Branch Target 
Cache (32 4-bit instruction branch targets), a memory
management unit (64 entries), and a high-bandwidth, 
pipelined external channel with separate instruction and 
data buses. The flexible register file may be used as a 
cache for run-time variables during program execution, 
or as a collection of register banks allocated to separate 
tasks in multitasking applications. 

The Am29000 provides a significant margin of performance
over other processors in its class, since the
majority of processor features were defined with the
maximum achievable performance in mind. This section
describes the features of the Am29000 from the point of
view of system performance.

Cycle Time 

The processor operates at a frequency of 33 MHz. The 
processor cycle time is a single, 30-ns clock period. The 
processor interface drivers can drive 80-pF loads at this 
frequency (for greater loads see Capacitive Output 
Delay table). 

The Am29000 architecture and system interfaces are 
designed so that the processor cycle time can decrease 
with technology improvements. 

Four-Stage Pipeline 

The Am29000 utilizes a four-stage pipeline, allowing it 
to execute one instruction every clock cycle. The pro- 
cessor can complete an instruction on every cycle, even 
though four cycles are required from the beginning of an 
instruction to its completion. 

At a 33-MHz operating frequency, the maximum instruc- 
tion execution rate is 33 million instructions per second 
(MIPS). The Am29000 pipeline is designed so that the 
Am29000 can operate at the maximum instruction 
execution rate a significant portion of the time. 
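
The relationship between the four-cycle instruction latency and the one-instruction-per-cycle completion rate can be checked with a little arithmetic. The sketch below assumes an ideal pipeline with no holds or stalls; the function names are hypothetical and chosen only for this illustration.

```python
PIPELINE_STAGES = 4
CLOCK_HZ = 33_000_000  # 33-MHz operating frequency


def cycle_time_ns(clock_hz: int = CLOCK_HZ) -> float:
    """Length of one processor cycle in nanoseconds."""
    return 1e9 / clock_hz


def cycles_to_complete(n_instructions: int, stages: int = PIPELINE_STAGES) -> int:
    """Cycles needed for n back-to-back instructions in a stall-free pipeline.

    The first instruction takes `stages` cycles; each later instruction
    completes one cycle after its predecessor.
    """
    if n_instructions <= 0:
        return 0
    return stages + (n_instructions - 1)


def sustained_mips(n_instructions: int, clock_hz: int = CLOCK_HZ) -> float:
    """Average execution rate, in MIPS, over a run of n instructions."""
    cycles = cycles_to_complete(n_instructions)
    return n_instructions / cycles * (clock_hz / 1e6)
```

Under these assumptions a single instruction takes 4 cycles, 100 instructions take 103 cycles, and the average rate approaches 33 MIPS as the run grows longer.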

Pipeline interlocks are implemented by processor hard- 
ware. Except for a few special cases, it is not necessary 
to rearrange programs to avoid pipeline dependencies. 

System Interface 

The Am29000 accesses external instructions and data
using three non-multiplexed buses. These buses are
referred to collectively as the channel. The channel
protocol minimizes the logic chains involved in a transfer, and
provides a maximum transfer rate of 264 Mbytes/s.

Separate Address, Instruction, and Data Buses 

The Am29000 incorporates two 32-bit buses for instruc- 
tion and data transfers, and a third address bus that is 
shared between instruction and data accesses. This 
bus structure allows simultaneous instruction and data 
transfers, even though the address bus is shared. The 




channel achieves the performance of four separate 
32-bit buses at a much-reduced pin count. 

Pipelined Addresses 

The Am29000 address bus is pipelined so that it can be 
released before an instruction or data transfer is com- 
pleted. This allows a subsequent access to begin before 
the first has been completed, and allows the processor 
to have two accesses in progress simultaneously. 

Support of Burst Devices and Memories 

Burst-mode accesses provide high transfer rates for
instructions and data at sequential addresses. For such
accesses, the address of the first instruction or datum
is sent, and subsequent requests for instructions or data
at sequential addresses do not require additional
address transfers. These instructions or data are
transferred until either party involved in the transfer
terminates the access.

Burst-mode accesses can occur at the rate of one
access per cycle after the first address has been
processed. At 33 MHz, the maximum achievable transfer
bandwidth for either instructions or data is 132 Mbytes/s.
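
The bandwidth figures quoted for the channel follow directly from the bus width and the clock rate. The short sketch below reproduces the arithmetic; `first_access_cycles` is a hypothetical parameter standing in for whatever initial latency a particular memory design has, and is not a value taken from this data book.

```python
CLOCK_HZ = 33_000_000   # 33-MHz operating frequency
BUS_WIDTH_BYTES = 4     # each of the instruction and data buses is 32 bits wide


def burst_bandwidth_mbytes(clock_hz: int = CLOCK_HZ,
                           width_bytes: int = BUS_WIDTH_BYTES) -> float:
    """Peak rate of one bus when one access completes per cycle."""
    return clock_hz * width_bytes / 1e6


def channel_bandwidth_mbytes() -> float:
    """Peak rate with the instruction bus and the data bus both bursting."""
    return 2 * burst_bandwidth_mbytes()


def burst_cycles(n_words: int, first_access_cycles: int = 1) -> int:
    """Cycles to move n sequential words: the first access, then one per cycle."""
    if n_words <= 0:
        return 0
    return first_access_cycles + (n_words - 1)
```

With these numbers one bus peaks at 132 Mbytes/s and the two buses together at 264 Mbytes/s; a 16-word burst to a memory that needs three cycles for its first access would take 18 cycles.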

Burst-mode accesses may occur to input/output de- 
vices if the system design permits. 

Interface to Fast Devices and Memories 

The processor can be interfaced to devices and memo- 
ries that complete accesses within one cycle. The chan- 
nel protocol takes maximum advantage of such devices 
and memories by allowing data to be returned to the 
processor during the cycle in which the address is trans- 
mitted. This allows a full range of memory-speed trade- 
offs to be made within a particular system. 

Register File 

An on-chip Register File containing 192 general- 
purpose registers allows most instruction operands to 
be fetched without the delay of an external access. The 
Register File incorporates several features that aid the 
retention of data required by an executing program. 
Because of the number of general-purpose registers,
the frequency of external references for the Am29000 is
significantly lower than the frequency of references in
processors having only 16 or 32 registers.

Triple-port access to the Register File allows two source 
operands to be fetched in one cycle while a previously 
computed result is written. Three 32-bit internal buses 
prevent contention in the routing of operands. All operand
fetches and result write-backs for instruction execution
can be performed in a single cycle.

The registers allow efficient procedure linkage by caching
a portion of a compiler's run-time stack. On the average,
procedure calls and returns can be executed 5 to
10 times faster (on a cycle-by-cycle basis) than in
processors that require the implementation of a run-time
stack in external memory (with the attendant loading
and storing of registers on procedure call and return).

The registers can contain variables, constants, ad- 
dresses, and operating-system values. In multitasking 
applications, they can be used to hold the processor 
status and variables for as many as eight different tasks. 
A register-banking option allows the Register File to be
divided into segments, which can be individually
protected. In this configuration, a task switch can occur in
as few as 17 cycles.

Instruction Execution 

The Am29000 uses an Arithmetic/Logic Unit, a Field
Shift Unit, and a Prioritizer to execute most instructions.
Each of these is organized to operate on 32-bit operands
and provide a 32-bit result. All operations are
performed in a single cycle.

Instruction operations are overlapped with operand 
fetch and result write-back to the Register File. Pipeline 
forwarding logic detects pipeline dependencies and
routes data as required, avoiding delays that might arise 
from these dependencies. 

Branch Target Cache 

In general, the Am29000 meets its instruction
bandwidth requirements via instruction prefetching.
However, instruction prefetching is ineffective when a
branch occurs. The Am29000 therefore incorporates an
on-chip Branch Target Cache to supply instructions for a
branch (if this branch has been taken previously)
while a new prefetch stream is established.

If branch-target instructions are in the Branch Target 
Cache, branches execute in a single cycle. The Branch 
Target Cache in the Am29000 has an average hit rate of 
60%. In other words, it eliminates the branch latency for 
60% of all successful branches on the average. 

Branching 

Branch conditions in the Am29000 are based on
Boolean data contained in general-purpose registers
rather than on arithmetic condition codes. Using a
condition-code register for the purpose of branching,
which is common in other processors, inhibits certain
compiler optimizations because the condition-code
register is modified by many different instructions. It is
difficult for an optimizing compiler to schedule this shared
use. By treating branch conditions as any other instruction
operand, the Am29000 avoids this problem.

The Am29000 executes branches in a single cycle for 
those cases where the target of the branch is in the 
Branch Target Cache. The single-cycle branch is un- 
usual for a pipelined processor, and is due to processor 
hardware that allows much of the branch instruction op- 
eration to be performed early in the execution of the 
branch. Single-cycle branching has a dramatic effect on 
performance, since successful branches typically repre- 
sent 15% to 25% of a processor's instruction mix. 



The techniques used to achieve single-cycle branching
also minimize the execution time of branches in those
cases where the target is not in the Branch Target
Cache. To keep the pipeline operating at the maximum
rate, the instruction following the branch, referred to as
the delay instruction, is executed regardless of the
outcome of the branch. An optimizing compiler can define a
useful instruction for the delay instruction in approximately
90% of branch instructions, thereby increasing
the performance of branches.

Loads and Stores 

The performance degradation of load and store opera- 
tions is minimized in the Am29000 by overlapping them 
with instruction execution, by taking advantage of 
pipelining, and by organizing the flow of external data 
onto the processor so that the impact of external ac- 
cesses is minimized. 

Overlapped Loads and Stores 

In the Am29000, a load or store is performed concurrently
with execution of instructions that do not have
dependencies on the load or store operation. An optimizing
compiler can schedule loads and stores in the
instruction sequence so that, in most cases, data accesses
are overlapped with instruction execution.

Overlapped load and store operations can achieve up to 
a 30% improvement in performance when data memory 
has a two-cycle access time. Processor hardware de- 
tects dependencies while overlapped loads and stores 
are being performed, so dependencies have no soft- 
ware implications. 

The Am29000 exception restart mechanism automati- 
cally saves information required to restart any load 
or store until the operation is successfully completed. 
Thus, it allows the overlapped execution of loads and 
stores while properly handling address-translation 
exceptions. 

The Am29000 data-flow organization avoids the one-cycle
penalty that would result from the contention
between load data and the results of overlapped instruction
execution. Load data is buffered in a latch while
awaiting an opportunity to be written into the register file.
This opportunity is guaranteed to arise before the next
load is executed. While the data is buffered in this latch,
it may be used as an instruction operand in place of the
destination register for the load.

Load Multiple and Store Multiple 

Load Multiple and Store Multiple instructions allow the 
transfer of the contents of multiple registers to or from 
external memories or devices. This transfer can occur at 
a rate of one register content per cycle. 

The advantage of Load Multiple and Store Multiple is
best seen in task switching, register-file saving and
restoring, and in block data moves. In many systems,
such operations require a significant percentage of
execution time.

The Load Multiple and Store Multiple sequences are
interruptible so that they do not affect interrupt latency.

Forwarding of Load Data 

Data that are sent to the processor at the completion of a 
load are forwarded directly to the appropriate execution 
unit if the data are required immediately by an instruc- 
tion. This avoids the common one-cycle delay from bus 
transfer to use of data, and reduces the access latency 
of external data by one cycle. 

Memory Management 

A 64-entry Translation Look-Aside Buffer (TLB) on the 
Am29000 performs virtual-to-physical address trans- 
lation, avoiding the cycle that would be required to trans- 
fer the virtual address to an external TLB. A number of 
enhancements improve the performance of address 
translation: 

1. Pipelining: The operation of the TLB is pipelined
with other processor operations.

2. Early Address Translation: Address translations
for load, store, and branch instructions occur
during the cycle in which these instructions
are executed. This allows the physical address
to be transferred externally in the next cycle.

3. Task Identifiers: Task Identifiers allow TLB entries
to be matched to different processes so that
TLB invalidation is not required during task
switches.

4. Least-Recently Used Hardware: This hardware
allows immediate selection of a TLB set to
be replaced.

5. Software Reload: Software reload allows the
operating system to use a page-mapping
scheme that is best matched to its environment.
Paged-segmented, one-level page mapping,
two-level page mapping, or any other user-defined
page-mapping scheme can be supported.
Because Am29000 instructions execute at an
average rate of nearly one instruction per cycle,
software reload has a performance approaching
that of hardware TLB reload.
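
To show how task identifiers, least-recently-used selection, and software reload fit together, here is a behavioral sketch of a 64-entry TLB. It is only a model under stated assumptions, not a description of the actual hardware: the 32-line by 2-set organization, the 4-Kbyte page size, and the names `Tlb`, `TlbEntry`, and `reload_handler` are all assumptions made for this illustration. The reload handler stands in for the operating system's page-mapping scheme.

```python
from dataclasses import dataclass
from typing import Callable

PAGE_BITS = 12   # assumed 4-Kbyte pages
NUM_LINES = 32   # assumed organization: 32 lines x 2 sets = 64 entries
NUM_SETS = 2


@dataclass
class TlbEntry:
    valid: bool = False
    vtag: int = 0    # virtual page number held by this entry
    tid: int = 0     # task identifier the entry belongs to
    frame: int = 0   # physical page number


class Tlb:
    def __init__(self, reload_handler: Callable[[int, int], int]):
        self.lines = [[TlbEntry() for _ in range(NUM_SETS)]
                      for _ in range(NUM_LINES)]
        self.lru = [0] * NUM_LINES   # per line: the set to replace next
        self.reload_handler = reload_handler
        self.misses = 0

    def translate(self, vaddr: int, tid: int) -> int:
        vpn = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        line = vpn % NUM_LINES
        for set_index, entry in enumerate(self.lines[line]):
            # An entry matches only if both the page and the task agree.
            if entry.valid and entry.vtag == vpn and entry.tid == tid:
                self.lru[line] = 1 - set_index   # the other set is now older
                return (entry.frame << PAGE_BITS) | offset
        # Miss: software supplies the mapping; the LRU set is replaced.
        self.misses += 1
        frame = self.reload_handler(vpn, tid)
        victim = self.lru[line]
        self.lines[line][victim] = TlbEntry(True, vpn, tid, frame)
        self.lru[line] = 1 - victim
        return (frame << PAGE_BITS) | offset
```

In this model, two tasks that use the same virtual page occupy separate entries, so switching back to the first task hits in the TLB without any invalidation.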

Interrupts and Traps 

When the Am29000 takes an interrupt or trap, it does not
automatically save its current state information. This 
greatly improves the performance of temporary inter- 
ruptions such as TLB reload, floating-point emulation, or 
other simple operating-system calls that require no sav- 
ing of state information. 
In cases where the processor state must be saved, the
saving and restoring of state information is under the
control of software. The methods and data structures
used to handle interrupts, and the amount of state
saved, may be tailored to the needs of a particular
system.

Interrupts and traps are dispatched through a 256-entry 
Vector Area, which directs the processor to a routine to 
handle a given interrupt or trap. The Vector Area may be
relocated in memory by the modification of a processor 
register. There may be multiple Vector Areas in the sys- 
tem, though only one is active at any given time. 

The Vector Area is either a table of pointers to the inter- 
rupt and trap handlers, or a segment of instruction mem- 
ory (possibly read-only memory) containing the han- 
dlers themselves. The choice between the two possible 
Vector Area definitions is determined by the cost/per- 
formance trade-offs made for a particular system. 

If the Vector Area is a table of vectors in data memory, it 
requires only 1 Kbyte of memory. However, this structure
requires that the processor perform a vector fetch every 
time an interrupt or trap is taken. The vector fetch re- 
quires at least three cycles in addition to the number of 
cycles required for the basic memory access. 

If the Vector Area is a segment of instruction memory, it
requires a maximum of 64 Kbytes of memory. The advantage
of this structure is that the processor begins the
execution of the interrupt or trap handler in the minimum
amount of time.
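
The two Vector Area sizes can be derived from the 256 entries and the 32-bit word size. The sketch below works through that arithmetic. The address formulas (base plus entry number times entry size) are an assumption made for illustration, and the function names are hypothetical.

```python
VECTOR_ENTRIES = 256
WORD_BYTES = 4                      # one 32-bit pointer or instruction
SEGMENT_MAX_BYTES = 64 * 1024       # maximum size of a handler segment


def table_size_bytes() -> int:
    """Size of a Vector Area that is a table of pointers."""
    return VECTOR_ENTRIES * WORD_BYTES


def handler_slot_bytes() -> int:
    """Space available to each handler when the area holds the handlers."""
    return SEGMENT_MAX_BYTES // VECTOR_ENTRIES


def handler_slot_instructions() -> int:
    """The same space counted in 32-bit instructions."""
    return handler_slot_bytes() // WORD_BYTES


def table_entry_address(base: int, vector: int) -> int:
    """Assumed address of the pointer for a given vector number."""
    return base + vector * WORD_BYTES


def segment_handler_address(base: int, vector: int) -> int:
    """Assumed address of the first instruction of a given handler."""
    return base + vector * handler_slot_bytes()
```

The table form therefore occupies 1024 bytes, while the segment form gives each of the 256 handlers up to 256 bytes, or 64 instructions.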

Floating-Point Arithmetic Unit 

The Am29027 is a double-precision, floating-point arithmetic
unit for the Am29000. It can provide an order-of-magnitude
performance increase over floating-point operations
performed in software. It performs both single-precision
and double-precision operations using IEEE
and other floating-point formats. The Am29027 also
supports 32- and 64-bit integer functions.

The Am29027 performs floating-point operations using
combinatorial, rather than sequential, logic; therefore,
operations with the Am29027 require only five
Am29000 cycles. Floating-point operations may be
overlapped with other processor operations. Furthermore,
the Am29027 incorporates pipeline registers
and eight operand registers, permitting very high
throughput for certain types of operations (such as array
computations).

The Am29027 attaches directly to the Am29000 using 
the coprocessor interface. The Am29000 can transfer 
two 32-bit quantities to the Am29027 in one cycle. 

The Am29027 is described in detail in the Am29027 
Arithmetic Accelerator Data Sheet (order# 09114) and 
the Am29027 Handbook (order# 11852). 

ARCHITECTURE HIGHLIGHTS

This section introduces the principal architectural
elements, hardware features, and system interfaces of the
Am29000.

Architecture Overview 

This section gives a brief description of the Am29000 
from a programmer's point of view. It introduces the 
processor's program modes, registers, and instructions. 
An overview of the processor's data formats and handling
is given. This section also briefly describes interrupts
and traps, memory management, and the
coprocessor interface. Finally, the Timer Facility and 
Trace Facility are introduced. 

Program Modes 

There are two mutually exclusive modes of program 
execution: the Supervisor mode and the User mode. In 
the Supervisor mode, executing programs have access 
to all processor resources. In the User mode, certain 
processor resources may not be accessed; any at- 
tempted access causes a trap. 

Visible Registers 

The Am29000 incorporates three classes of registers 
that are accessed and manipulated by instructions: 
general-purpose registers, special-purpose registers, 
and Translation Look-Aside Buffer (TLB) registers. (Re- 
fer to the Register Description section for greater detail 
of the register categories.) 

General-Purpose Registers 

The Am29000 has 192 general-purpose registers. With 
a few exceptions, general-purpose registers are not 
dedicated to any special use and are available for any 
appropriate program use. 

Most processor instructions are three-address instructions.
An instruction specifies any three of the 192 registers
for use in instruction execution. Normally, two of
these registers contain source operands for the instruction,
and a third stores the result of the instruction.

The 192 registers are divided into 64 global and 128 local
registers. Global registers are addressed with absolute
register numbers, while local registers are addressed
relative to an internal Stack Pointer.
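
Stack-pointer-relative addressing of the local registers can be pictured as a circular window over the 128 local registers. The sketch below is an illustration under assumptions that are not stated in this section: that local registers occupy absolute register numbers starting at 128, that the Stack Pointer is expressed here as a word index, and that addition wraps modulo 128. The name `local_to_absolute` is hypothetical.

```python
NUM_LOCAL = 128
LOCAL_BASE = 128   # assumed absolute number of the first local register


def local_to_absolute(local_number: int, stack_pointer_words: int) -> int:
    """Map a stack-pointer-relative local register number to an absolute one."""
    if not 0 <= local_number < NUM_LOCAL:
        raise ValueError("local register number must be in 0..127")
    return LOCAL_BASE + (stack_pointer_words + local_number) % NUM_LOCAL
```

Under these assumptions, a procedure call that lowers the Stack Pointer by four words leaves the caller's local register 2 visible to the callee as local register 6, with no data movement: `local_to_absolute(2, 10)` and `local_to_absolute(6, 6)` name the same physical register.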

For fast procedure calling, a portion of a compiler's run- 
time stack can be mapped into the local registers. Stati- 
cally allocated variables, temporary values, and operat- 
ing-system parameters are kept in the global registers. 

The Stack Pointer for local registers is mapped to Global
Register 1. The Stack Pointer is a full 32-bit virtual address
for the top of the run-time stack.

The general-purpose registers may be accessed indirectly,
with the register number specified by the content
of a special-purpose register (see below) rather
than by an instruction field. Three independent indirect
register numbers are contained in three separate
special-purpose registers. Indirect addressing is accomplished
by specifying Global Register 0 as an instruction
operand or result register. An instruction can specify an
indirect register access for any or all of the source operands
or result.

General-purpose registers may be partitioned into seg- 
ments of 16 registers for the purpose of access protec- 
tion. A register in a protected segment may be accessed 
only by a program executing in the Supervisor mode. An 
attempted access (either read or write) by a User-mode 
program causes a trap to occur. 
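
The protection check for 16-register segments can be sketched as a simple bit test. This is a minimal model, assuming that bit i of a hypothetical `protect_mask` value covers absolute registers 16i through 16i+15; the function names are also hypothetical, and a `False` result corresponds to the access causing a trap.

```python
BANK_SIZE = 16


def bank_of(register_number: int) -> int:
    """Index of the 16-register segment that contains a register."""
    return register_number // BANK_SIZE


def access_allowed(register_number: int, protect_mask: int,
                   supervisor_mode: bool) -> bool:
    """Return True if the access proceeds, False if it would trap."""
    if supervisor_mode:
        return True   # Supervisor-mode programs may access every segment
    return (protect_mask >> bank_of(register_number)) & 1 == 0
```

For example, with only segment 8 protected, a User-mode access to register 130 is refused while an access to register 144 (segment 9) is allowed.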

Special-Purpose Registers 

The Am29000 contains 27 special-purpose registers. 
These registers provide controls and data for certain 
processor functions. 

Special-purpose registers are accessed by data move- 
ment only. Any special-purpose register can be written 
with the contents of any general-purpose register, and 
any general-purpose register can be written with the 
contents of any special-purpose register. Operations 
cannot be performed directly on the contents of special- 
purpose registers. 

Some special-purpose registers are protected, and can 
be accessed only in the Supervisor mode. This restric- 
tion applies to both read and write accesses. An attempt 
by a User-mode program to access a protected register 
causes a trap to occur. 

The protected special-purpose registers are defined as 
follows: 

1. Vector Area Base Address: Defines the beginning
of the interrupt/trap Vector Area.

2. Old Processor Status: Receives a copy of the
Current Processor Status (see below) when an
interrupt or trap is taken. It is later used to restore
the Current Processor Status on an interrupt
return.

3. Current Processor Status: Contains control
information associated with the currently executing
process, such as interrupt disables and the
Supervisor Mode bit.

4. Configuration: Contains control information
that normally varies only from system to
system, and usually is set only during system
initialization.

5. Channel Address: Contains the address associated
with an external access, and retains the
address if the access is not completed successfully.
The Channel Address Register, in conjunction
with the Channel Data and Channel
Control registers described below, allows the
restarting of unsuccessful external accesses. This
might be necessary for an access encountering
a page fault in a demand-paged environment,
for example.

6. Channel Data: Contains data associated with a
store operation, and retains the data if the operation
is not completed successfully.

7. Channel Control: Contains control information
associated with a channel operation, and retains
this information if the operation is not completed
successfully.

8. Register Bank Protect: Restricts access of
user-mode programs to specified groups of 16
registers. This facilitates register banking for
multitasking applications, and protects operating
system parameters kept in the global registers
from corruption by user-mode programs.

9. Timer Counter: Supports real-time control and
other timing-related functions.

10. Timer Reload: Maintains synchronization of
the Timer Counter. It includes control bits for the
Timer Facility.

11. Program Counter 0: Contains the address of
the instruction being decoded when an interrupt
or trap is taken. The processor restarts this
instruction upon interrupt return.

12. Program Counter 1: Contains the address of
the instruction being executed when an interrupt
or trap is taken. The processor restarts this
instruction upon interrupt return.

13. Program Counter 2: Contains the address of
the instruction just completed when an interrupt
or trap is taken. This address is provided for
information only, and does not participate in an
interrupt return.

14. MMU Configuration: Allows selection of various
memory-management options, such as
page size.

15. LRU Recommendation: Simplifies the reload of
entries in the Translation Look-Aside Buffer
(TLB) by providing information on the least
recently used entry of the TLB when a TLB miss
occurs.

The unprotected special-purpose registers are defined 
as follows: 

1. Indirect Pointer C: Allows the indirect access of
a general-purpose register.

2. Indirect Pointer A: Allows the indirect access of
a general-purpose register.

3. Indirect Pointer B: Allows the indirect access of
a general-purpose register.

4. Q: Provides additional operand bits for multiply
step, divide step, and divide operations.

5. ALU Status: Contains information about the
outcome of integer arithmetic and logical operations,
and holds residual control for certain
instruction operations.

6. Byte Pointer: Contains an index of a byte or
half-word within a word. This register is also
accessible via the ALU Status Register.

7. Funnel Shift Count: Provides a bit offset for the
extraction of word-length fields from double-word
operands. This register is also accessible
via the ALU Status Register.

8. Load/Store Count Remaining: Maintains a
count of the number of loads and stores remaining
for Load Multiple and Store Multiple operations.
The count is initialized to the total number
of loads or stores to be performed before the
operation is initiated. This register is also accessible
via the Channel Control Register.

9. Floating-Point Environment: Controls the
operation of floating-point arithmetic, such as
rounding modes and exception reporting.

10. Integer Environment: Enables and disables the
reporting of exceptions that occur during integer
multiply and divide operations.

11. Floating-Point Status: Contains information
about the outcome of floating-point operations.

12. Exception Opcode: Reports the operation code
of an instruction causing a trap. This register is
provided primarily for recovery from floating-point
exceptions, but is also set for other instructions
that cause traps.
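
The role of the Funnel Shift Count can be illustrated with a small model of extracting a 32-bit field from a 64-bit double word. The sketch assumes that the two source words are concatenated, shifted left by the count, and that the high-order word of the result is kept; this behavior and the name `funnel_extract` are assumptions for illustration rather than a definition taken from this section.

```python
WORD_MASK = 0xFFFFFFFF


def funnel_extract(high_word: int, low_word: int, shift_count: int) -> int:
    """Extract a 32-bit field that starts shift_count bits into a double word."""
    if not 0 <= shift_count <= 31:
        raise ValueError("shift count must be in 0..31")
    double_word = ((high_word & WORD_MASK) << 32) | (low_word & WORD_MASK)
    # Shift left, then keep the upper 32 bits of the 64-bit window.
    return ((double_word << shift_count) >> 32) & WORD_MASK
```

With a count of 8, the words 0x12345678 and 0x9ABCDEF0 yield 0x3456789A: the field begins one byte into the first word and ends one byte into the second.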

TLB Registers 

Translation Look-Aside Buffer (TLB) entries in the 
Am29000 Memory Management Unit are accessed via 
128 TLB registers. A single TLB entry appears as two 
TLB registers; TLB registers are thus paired according 
to the corresponding TLB entry. 

TLB registers are accessed by data movement only.
Any TLB register can be written with the contents of any 
general-purpose register, and any general-purpose reg- 
ister can be written with the contents of any TLB register. 
Operations cannot be performed directly on the 
contents of TLB registers. 

TLB registers can be accessed only in the Supervisor 
mode. This restriction applies to both read and write ac- 
cesses. An attempt by a User-mode program to access 
a TLB register causes a trap to occur. 

Instruction Set Overview

The three-address architecture of the Am29000 instruction
set allows a compiler or assembly-language programmer
to prevent the destruction of operands, and
aids register allocation and operand reuse. Instruction
operands may be contained in any 2 of the 192 general-purpose
registers, and instruction results may be stored
in any of the 192 general-purpose registers.

The compiler or assembly-language programmer has
complete freedom to allocate register usage. There is
no dedication of a particular register or register group to
a particular class of operations. The instruction set is
designed to minimize the number of side effects and
implicit operations of instructions.

Most Am29000 instructions can specify an 8-bit constant
as one of the source operands. Larger constants
are constructed using one or two additional instructions
and a general-purpose register. Relative branch instructions
specify a 16-bit, signed, word offset. Absolute
branches specify a 16-bit word address.

The Am29000 instruction set contains 117 instructions.
These instructions are divided into nine classes:

1. Integer Arithmetic: Perform integer add, subtract,
multiply, and divide operations.

2. Compare: Perform arithmetic and logical comparisons.
Some instructions in this class allow
the generation of a trap if the comparison condition
is not met.

3. Logical: Perform a set of bit-wise Boolean operations.

4. Shift: Perform arithmetic and logical shifts, and
allow the extraction of 32-bit words from 64-bit
double words.

5. Data Movement: Perform movement of data
fields between registers, and the movement
of data to and from external devices and
memories.

6. Constant: Allow the generation of large constant
values in registers.

7. Floating-Point: Included for floating-point arithmetic,
comparisons, and format conversions.
These instructions are not currently implemented
directly in processor hardware.

8. Branch: Perform program jumps and subroutine
calls.

9. Miscellaneous: Perform miscellaneous control
functions and operations not provided by other
classes.

The Am29000 executes all instructions in a single cycle,
except for interrupt returns, Load Multiple, and Store
Multiple.

Figure 1 shows a complete list of Am29000 instructions,
listed alphabetically by instruction mnemonic (refer to
the Instruction Set section for more details).

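
The instruction set overview notes that constants too large for an instruction's 8-bit field are built with one or two additional instructions. The sketch below models one way this can work using the CONST and CONSTH mnemonics from Figure 1, assuming CONST loads a zero-extended 16-bit value and CONSTH replaces the upper 16 bits while keeping the lower 16. The helper names and the tuple representation of instructions are hypothetical.

```python
WORD_MASK = 0xFFFFFFFF


def build_constant(value: int):
    """Return a list of (mnemonic, 16-bit immediate) pairs that builds value."""
    value &= WORD_MASK
    low16 = value & 0xFFFF
    high16 = value >> 16
    sequence = [("CONST", low16)]           # sets bits 15..0, clears bits 31..16
    if high16:
        sequence.append(("CONSTH", high16))  # fills in bits 31..16
    return sequence


def simulate(sequence) -> int:
    """Apply the modeled instructions to one destination register."""
    register = 0
    for mnemonic, immediate in sequence:
        if mnemonic == "CONST":
            register = immediate & 0xFFFF
        elif mnemonic == "CONSTH":
            register = ((immediate & 0xFFFF) << 16) | (register & 0xFFFF)
        else:
            raise ValueError(f"unmodeled instruction: {mnemonic}")
    return register
```

A value that fits in 16 bits needs only the first instruction, while a full 32-bit value such as 0x12345678 needs both.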
Mnemonic  Instruction Name

ADD       Add
ADDC      Add with Carry
ADDCS     Add with Carry, Signed
ADDCU     Add with Carry, Unsigned
ADDS      Add, Signed
ADDU      Add, Unsigned
AND       AND Logical
ANDN      AND-NOT Logical
ASEQ      Assert Equal To
ASGE      Assert Greater Than or Equal To
ASGEU     Assert Greater Than or Equal To, Unsigned
ASGT      Assert Greater Than
ASGTU     Assert Greater Than, Unsigned
ASLE      Assert Less Than or Equal To
ASLEU     Assert Less Than or Equal To, Unsigned
ASLT      Assert Less Than
ASLTU     Assert Less Than, Unsigned
ASNEQ     Assert Not Equal To
CALL      Call Subroutine
CALLI     Call Subroutine, Indirect
CLASS     Classify Floating-Point Operand
CLZ       Count Leading Zeros
CONST     Constant
CONSTH    Constant, High
CONSTN    Constant, Negative
CONVERT   Convert Data Format
CPBYTE    Compare Bytes
CPEQ      Compare Equal To
CPGE      Compare Greater Than or Equal To
CPGEU     Compare Greater Than or Equal To, Unsigned
CPGT      Compare Greater Than
CPGTU     Compare Greater Than, Unsigned
CPLE      Compare Less Than or Equal To
CPLEU     Compare Less Than or Equal To, Unsigned
CPLT      Compare Less Than
CPLTU     Compare Less Than, Unsigned
CPNEQ     Compare Not Equal To
DADD      Floating-Point Add, Double-Precision
DDIV      Floating-Point Divide, Double-Precision
DEQ       Floating-Point Equal To, Double-Precision
DGE       Floating-Point Greater Than or Equal To, Double-Precision
DGT       Floating-Point Greater Than, Double-Precision
DIV       Divide Step
DIV0      Divide Initialize
DIVIDE    Integer Divide, Signed
DIVIDU    Integer Divide, Unsigned
DIVL      Divide Last Step
DIVREM    Divide Remainder
DMUL      Floating-Point Multiply, Double-Precision
DSUB      Floating-Point Subtract, Double-Precision
EMULATE   Trap to Software Emulation Routine
EXBYTE    Extract Byte
EXHW      Extract Half-Word
EXHWS     Extract Half-Word, Sign-Extended
EXTRACT   Extract Word, Bit-Aligned
FADD      Floating-Point Add, Single-Precision
FDIV      Floating-Point Divide, Single-Precision
FDMUL     Floating-Point Multiply, Single-to-Double Precision
FEQ       Floating-Point Equal To, Single-Precision
FGE       Floating-Point Greater Than or Equal To, Single-Precision



Figure 1. Am29000 Instruction Set 

Mnemonic    Instruction Name

FGT         Floating-Point Greater Than, Single-Precision
FMUL        Floating-Point Multiply, Single-Precision
FSUB        Floating-Point Subtract, Single-Precision
HALT        Enter Halt Mode
INBYTE      Insert Byte
INHW        Insert Half-Word
INV         Invalidate
IRET        Interrupt Return
IRETINV     Interrupt Return and Invalidate
JMP         Jump
JMPF        Jump False
JMPFDEC     Jump False and Decrement
JMPFI       Jump False Indirect
JMPI        Jump Indirect
JMPT        Jump True
JMPTI       Jump True Indirect
LOAD        Load
LOADL       Load and Lock
LOADM       Load Multiple
LOADSET     Load and Set
MFSR        Move from Special Register
MFTLB       Move from Translation Look-Aside Buffer Register
MTSR        Move to Special Register
MTSRIM      Move to Special Register Immediate
MTTLB       Move to Translation Look-Aside Buffer Register
MUL         Multiply Step
MULL        Multiply Last Step
MULTIPLU    Integer Multiply, Unsigned
MULTIPLY    Integer Multiply, Signed
MULTM       Integer Multiply Most-Significant Bits, Signed
MULTMU      Integer Multiply Most-Significant Bits, Unsigned
MULU        Multiply Step, Unsigned
NAND        NAND Logical
NOR         NOR Logical
OR          OR Logical
SETIP       Set Indirect Pointers
SLL         Shift Left Logical
SQRT        Square Root
SRA         Shift Right Arithmetic
SRL         Shift Right Logical
STORE       Store
STOREL      Store and Lock
STOREM      Store Multiple
SUB         Subtract
SUBC        Subtract with Carry
SUBCS       Subtract with Carry, Signed
SUBCU       Subtract with Carry, Unsigned
SUBR        Subtract Reverse
SUBRC       Subtract Reverse with Carry
SUBRCS      Subtract Reverse with Carry, Signed
SUBRCU      Subtract Reverse with Carry, Unsigned
SUBRS       Subtract Reverse, Signed
SUBRU       Subtract Reverse, Unsigned
SUBS        Subtract Signed
SUBU        Subtract Unsigned
XNOR        Exclusive-NOR Logical
XOR         Exclusive-OR Logical



Figure 1. Am29000 Instruction Set (continued) 

Data Formats and Handling

This section introduces the data formats and data- 
manipulation mechanisms that are supported by the 
Am29000. 

Data Types 

A word is defined as 32 bits of data. A half-word consists 
of 16 bits, and a double word consists of 64 bits. Bytes 
are 8 bits in length. The Am29000 has direct support 
for word-integer (signed and unsigned), word-logical, 
word-Boolean, half-word integer (signed and unsigned), 
and character (signed and unsigned) data. 

Other data types, such as character strings, are sup- 
ported with sequences of basic instructions and/or ex- 
ternal hardware. Single- and double-precision floating- 
point types are defined for the Am29000, but are not 
supported directly by hardware. 

The format for Boolean data used by the processor is 
such that the Boolean values TRUE and FALSE are rep- 
resented by 1 and 0, respectively, in the most-significant 
bit of a word. 
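
The Boolean convention above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and clearing the low 31 bits in `make_boolean` is an assumption of the sketch rather than a statement about processor hardware.

```python
WORD_MASK = 0xFFFFFFFF   # a word is 32 bits
BOOL_BIT = 1 << 31       # most-significant bit of a word

def make_boolean(flag):
    """Return a 32-bit word with TRUE (1) or FALSE (0) in the MSB."""
    return BOOL_BIT if flag else 0

def is_true(word):
    """Test only the most-significant bit; the other 31 bits are ignored."""
    return (word & WORD_MASK & BOOL_BIT) != 0
```

Because only the most-significant bit is examined, any word with that bit set reads as TRUE in this model.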

Figure 2 illustrates the numbering conventions for data 
units contained in a word. Within a word, bits are num- 
bered in increasing order from right to left, starting with 
the number for the least-significant bit. Bytes and half- 
words within a word are numbered in increasing order, 
starting with the number 0. However, bytes and half-words
may be numbered right-to-left or left-to-right, as
controlled by the Configuration Register.

Note that the numbering of bits within words is strictly for 
notational convenience. In contrast, the numbering con- 
ventions for bytes and half-words within words affect 
processor operations. 
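
To make the effect of the byte-numbering convention concrete, here is a small Python sketch. The `left_to_right` flag stands in for the Configuration Register control, whose actual field name is not given here; the mapping of left-to-right numbering to "byte 0 is the most-significant byte" is an assumption of the sketch.

```python
def extract_byte(word, n, left_to_right):
    """Return byte number n (0..3) of a 32-bit word.

    left_to_right=True : byte 0 is the most-significant (leftmost) byte.
    left_to_right=False: byte 0 is the least-significant (rightmost) byte.
    """
    if not 0 <= n <= 3:
        raise ValueError("byte number must be 0..3")
    shift = 8 * (3 - n) if left_to_right else 8 * n
    return (word >> shift) & 0xFF
```

The same byte number therefore selects different data depending on the convention in force, which is why this numbering, unlike bit numbering, affects processor operations.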

External Data Accesses 

External accesses move data between the processor 
and external devices and memories. These accesses 
occur only as a result of load and store instructions. 

Load and store instructions move words of data to and 
from general-purpose registers. Each load and store in- 
struction moves a single word. There are load and store 
instructions that support interlocking operations neces- 
sary for multiprocessor exclusion, synchronization, and 
communication. 

For the movement of multiple words, Load Multiple and
Store Multiple instructions move the contents of sequentially
addressed external locations to or from sequentially
numbered general-purpose registers. The
Load Multiple and Store Multiple instructions allow the movement of
up to 192 words at a maximum rate of one word per 
processor cycle. The multiple load and store sequences 
may be interrupted, and restarted at the point of 
interruption. 

Figure 2. Data-Unit Numbering Conventions

Load and store instructions provide no mechanism for
computing the address associated with the external 
data access. All addresses are contained in a general- 
purpose register at the beginning of the access, or are 
given by an 8-bit instruction constant. Any address computation
must be performed explicitly before the load or
store instruction is executed. Since address computa- 
tions are expressed directly, they are exposed for 
compiler optimizations as any other computations are. 

External data accesses are overlapped with instruction 
execution. Processor performance is improved if instructions
that follow loads do not immediately use externally
referenced data. In this manner, the time required
to perform the external access is overlapped with
subsequent instruction execution. Because of hardware 
interlocks, this concurrency has no effect on the logical 
behavior of an executing program. 

Addressing and Alignment 

External instructions and data are contained in one of 
four 32-bit address spaces: 

1. Instruction/Data Memory

2. Input/Output 

3. Coprocessor 

4. Instruction Read-Only Memory (Instruction
ROM)

An address in the instruction/data memory address 
space may be treated as virtual or physical, as deter- 
mined by the Current Processor Status Register. Ad- 
dress translation for data accesses is enabled sepa- 
rately from address translation for instruction accesses. 
A program in the Supervisor mode may temporarily dis- 
able address translation for individual loads and stores; 
this permits load-real and store-real operations. 

Bits contained within load and store instructions distinguish
between the instruction/data memory, input/output,
and coprocessor address spaces. Address translation
also may determine whether an access is performed
in the instruction/data memory or the input/output
address space. The Current Processor Status Register
determines whether instruction accesses are directed
to the instruction/data memory address space or
to the instruction ROM address space.

The Am29000 does not support data accesses directly 
to the instruction ROM address space. However, this
capability is possible as a system option. 

All addresses are interpreted as byte addresses, al- 
though accesses are word-oriented. The number of a 
byte within a word is given by the two least-significant 
address bits. The number of a half-word within a word is 
given by the next-to-least-significant address bit. 
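
As a brief illustration of this address decomposition, the following Python helpers (hypothetical names, not part of any AMD tool) split a 32-bit byte address into the containing word address, the byte number, and the half-word number.

```python
def word_address(addr):
    """Address of the word containing addr (two low bits cleared)."""
    return addr & ~0x3 & 0xFFFFFFFF

def byte_number(addr):
    """Byte within the word: the two least-significant address bits."""
    return addr & 0x3

def half_word_number(addr):
    """Half-word within the word: the next-to-least-significant bit."""
    return (addr >> 1) & 0x1
```

For example, byte address 0x1003 lies in the word at 0x1000, with byte number 3 and half-word number 1.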

Since only byte addressing is supported, it is possible 
that an address for the access of a word or half-word is 
not aligned to the desired word or half-word. For a word 
access, an unaligned address has a 1 in either or both of 
the two least-significant address bits. For a half-word 
access, an unaligned address has a 1 in the least-significant
address bit. In many systems, address alignment
can be ignored, with addresses truncated to access
the word or half-word of interest. However, as a
user option, the Am29000 creates a trap when a non- 
aligned access is attempted. The trap allows software 
emulation of nonaligned accesses. 
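
The two behaviors described above (truncate the address, or trap) can be modeled as follows. This is a sketch under stated assumptions: `trap_enabled` stands in for the user option, and the exception class is a hypothetical stand-in for the processor trap.

```python
class UnalignedAccessTrap(Exception):
    """Hypothetical stand-in for the trap taken on a nonaligned access."""

def is_unaligned(addr, size_bytes):
    """True if a word (4) or half-word (2) access at addr is unaligned."""
    if size_bytes == 4:
        return (addr & 0x3) != 0   # either of the two low bits set
    if size_bytes == 2:
        return (addr & 0x1) != 0   # least-significant bit set
    if size_bytes == 1:
        return False               # bytes are always aligned
    raise ValueError("size must be 1, 2, or 4 bytes")

def check_access(addr, size_bytes, trap_enabled):
    """Return the address actually used, or raise if trapping is enabled."""
    if is_unaligned(addr, size_bytes):
        if trap_enabled:
            raise UnalignedAccessTrap(hex(addr))
        return addr & ~(size_bytes - 1)   # truncate to the aligned unit
    return addr
```

A trap handler that catches the exception is where software emulation of the nonaligned access would go.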

In the Am29000, all instructions are 32 bits in length, and
are aligned on word-address boundaries.

Byte and Half-Word Accesses

The Am29000 supports the direct external access of 
bytes and half-words as an option. If this option is en- 
abled, the Am29000 selects a byte or half-word within a 
word on a load, and aligns it to the low-order byte or half- 
word of a register. On a store, the low-order byte or half- 
word of a register is replicated in all byte or half-word po- 
sitions, so that the external memory can easily write the 
required byte or half-word in memory. This option re- 
quires that the external memory system be able to write 
individual bytes and half-words within words. 

To avoid the memory-system complexity caused by 
writing individual bytes and half-words, the Am29000 
can perform byte and half-word accesses using soft- 
ware alone. The Am29000 can set a byte-position 
indicator in the ALU Status Register as an option for load 
instructions, with the two least-significant bits of the
address for the load. To load a byte or half-word, a word 
load is first performed. This load sets the byte-position 
indicator, and a subsequent instruction extracts the byte 
or half-word of interest from the accessed word. To store 
a byte or half-word, a load is also first performed; the 
byte or half-word of interest is inserted into the accessed 
word, and the resulting word then is stored. Even if 
the Am29000 is configured to perform byte and 
half-word accesses in hardware, this software-only 
technique operates correctly; this allows software to be 
upwardly compatible from simpler systems to more 
complex systems. 
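
The software-only technique can be sketched as a load, extract or insert, and store sequence. In this Python model the `memory` dictionary is a hypothetical word-addressed memory, the helper names are illustrative, and byte 0 is assumed to be the least-significant byte; the actual ordering depends on the Configuration Register.

```python
memory = {}   # hypothetical memory: word address -> 32-bit word

def load_word(addr):
    """Word load; also returns the byte-position indicator (two LSBs)."""
    return memory.get(addr & ~0x3, 0), addr & 0x3

def extract_byte(word, bp):
    """Select the byte at position bp and align it to the low-order byte."""
    return (word >> (8 * bp)) & 0xFF

def insert_byte(word, bp, value):
    """Replace the byte at position bp, leaving the other bytes intact."""
    shift = 8 * bp
    return (word & ~(0xFF << shift) & 0xFFFFFFFF) | ((value & 0xFF) << shift)

def load_byte(addr):
    word, bp = load_word(addr)
    return extract_byte(word, bp)

def store_byte(addr, value):
    word, bp = load_word(addr)                  # a load is performed first
    memory[addr & ~0x3] = insert_byte(word, bp, value)   # then a word store
```

Only whole words ever reach `memory`, which is the point of the technique: the external memory system never needs to write an individual byte.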

Interrupts and Traps 

Normal program flow may be preempted by an interrupt
or trap for which the processor is enabled. The effect on
the processor is identical for interrupts and traps; the
distinction is in the different mechanisms by which interrupts
and traps are enabled. It is intended that interrupts
be used for suspending current program execution and 
causing another program to execute, while traps are 
used to report errors and exceptional conditions. 

The interrupt and trap mechanism supports high-speed,
temporary context switching and user-defined
interrupt-processing mechanisms.

Temporary Context Switching 

The basic interrupt/trap mechanism of the Am29000
supports temporary context switching. During the temporary
context switch, the interrupted context is held in
processor registers. The interrupt or trap handler can return
immediately to this context.

Temporary context switching is useful for instruction
emulation, floating-point operations, TLB reload routines,
and so forth. Many of its features are similar to
microprogram execution; processor context does not
have to be saved, interrupts are disabled for the duration
of the program, and all processor resources are accessible,
even if the context that was interrupted is in the
User mode. The associated routine may execute from
instruction/data memory or instruction ROM.

User-Defined Interrupt Processing 

Since the basic interrupt/trap mechanism for the 
Am29000 keeps the interrupted context in the processor,
dynamically nested interrupts are not supported
directly. The context in the processor must be saved 
before another interrupt or trap can be taken. 

The interrupt or trap handler executing during a temporary
context switch is not required to return to the
interrupted context. This routine optionally may save the
interrupted context, load a new one, and return to the 
new context. 

The implementation of the saving and restoring of con- 
texts is completely user-defined. Thus, the context 
save/restore mechanism used (e.g., interrupt stack,
program status word area, etc.) and the amount of con- 
text saved may be tailored to the needs of the system. 

Vector Area 

Interrupt and trap dispatching occur through a 
relocatable Vector Area, which accommodates as many 
as 256 interrupt and trap handling routines. Entries into
the Vector Area are associated with various sources of 
interrupts and traps; some are predefined while others 
are user-defined. 

The Vector Area is either a table of vectors in data mem- 
ory where each vector points to the beginning of an in- 
terrupt or trap handler, or it is a segment of instruction/ 
data memory (or instruction ROM) containing the actual
routines. The latter configuration for the Vector Area 
yields better interrupt performance at the cost of additional
memory.

Memory Management 

The Am29000 incorporates a Memory Management 
Unit (MMU) that accepts a 32-bit virtual byte address 
and translates it to a 32-bit physical byte address in a 
single cycle. The MMU is not dedicated to any particular 
address-translation architecture. 

Address translation in the MMU is performed by a
64-entry Translation Look-Aside Buffer (TLB), an asso- 
ciative table containing the most recently used address 
translations for the processor. If the translation for a 
given address cannot be performed by the TLB, a TLB 
miss occurs and causes a trap that allows the required 
translation to be placed into the TLB. 

Processor hardware maintains information for each 
TLB line indicating which entry was least recently used; 
when a TLB miss occurs, this information is used to 
indicate the TLB entry to be replaced. Software is
responsible for searching system page tables and modifying
the indicated TLB entry as appropriate. This allows
the page tables to be defined according to the system 
environment. 

TLB entries are modified directly by processor instruc- 
tions. A TLB entry consists of 64 bits and appears as two 
word-length TLB registers, which may be inspected and 
modified by instructions.

TLB entries are tagged with a Task Identifier field, which 
allows the operating system to create a unique 32-bit vir- 
tual address space for each of 256 processes. In addi- 
tion, TLB entries provide support for memory protection 
and user-defined control information. 
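
The division of labor described in this section (hardware lookup and least-recently-used tracking, software reload on a miss, entries tagged by task) can be sketched in Python. Several details are assumptions of the sketch and not taken from the text above: the 64 entries are organized as 32 lines of 2 entries, the page size is 4 KB, the line is selected by the low bits of the virtual page number, and the page table is a plain dictionary.

```python
PAGE_BITS = 12   # assumed page size of 4 KB
LINES = 32       # assumed organization: 32 lines x 2 entries = 64 entries
WAYS = 2

class TLB:
    def __init__(self):
        # Each entry is (task_id, virtual_page, physical_page) or None.
        self.lines = [[None] * WAYS for _ in range(LINES)]
        self.lru = [0] * LINES   # least recently used entry of each line

    def translate(self, task_id, vaddr):
        """Return the physical address, or None to model a TLB miss trap."""
        vpage = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        line = vpage % LINES
        for way, entry in enumerate(self.lines[line]):
            if entry is not None and entry[:2] == (task_id, vpage):
                self.lru[line] = 1 - way   # the other entry is now LRU
                return (entry[2] << PAGE_BITS) | offset
        return None

    def reload(self, task_id, vaddr, ppage):
        """Software miss handler: overwrite the LRU entry of the line."""
        vpage = vaddr >> PAGE_BITS
        line = vpage % LINES
        way = self.lru[line]
        self.lines[line][way] = (task_id, vpage, ppage)
        self.lru[line] = 1 - way

def access(tlb, page_table, task_id, vaddr):
    """Translate vaddr, reloading the TLB from page_table on a miss."""
    paddr = tlb.translate(task_id, vaddr)
    if paddr is None:
        ppage = page_table[(task_id, vaddr >> PAGE_BITS)]   # software search
        tlb.reload(task_id, vaddr, ppage)
        paddr = tlb.translate(task_id, vaddr)
    return paddr
```

Because the page-table search lives entirely in `access`, the table format can be changed without touching the TLB model, which mirrors the flexibility the text attributes to software-managed reload.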

Coprocessor Programming 

The coprocessor interface for the Am29000 allows a 
program to communicate with an off-chip coprocessor 
for performing operations not supported by processor 
hardware directly. 

The coprocessor interface allows the program to trans- 
fer operands and operation codes to the coprocessor, 
and then perform other operations while the coproces- 
sor operation is in progress. The results of the operation 
are read from the coprocessor by a separate transfer. 
The processor may transfer multiple operands to the 
coprocessor without retransferring operation codes or 
reading intermediate results. As many as 64 bits of in- 
formation can be transferred to the coprocessor in a 
single cycle. 

The Am29000 includes features that support the defini- 
tion of the coprocessor as a system option. In this case, 
coprocessor operations are emulated by software when 
the coprocessor is not present in a system. 

Timer Facility 

The Timer Facility provides a counter for implementing a
real-time clock or other software timing functions. This 
facility comprises two special-purpose registers: the 
Timer Counter Register, which decrements at a rate 
equal to the processor operating frequency, and the 
Timer Reload Register, which reinitializes the Timer 
Counter Register when it decrements to 0. The Timer 
Facility optionally may create an interrupt when the 
Timer Counter decrements to 0. 
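
A minimal Python model of the two timer registers is given below. The class and function names are hypothetical, the exact cycle on which the reload takes effect is an assumption, and the 25 MHz figure in the usage note is purely illustrative.

```python
class TimerFacility:
    def __init__(self, reload_value, interrupt_enabled=True):
        self.reload = reload_value     # models the Timer Reload Register
        self.counter = reload_value    # models the Timer Counter Register
        self.interrupt_enabled = interrupt_enabled

    def tick(self):
        """Advance one processor cycle; return True if an interrupt is raised."""
        self.counter -= 1
        if self.counter == 0:
            self.counter = self.reload   # reinitialize from the reload value
            return self.interrupt_enabled
        return False

def reload_for_period(clock_hz, period_s):
    """Reload value giving one timer interrupt every period_s seconds."""
    return round(clock_hz * period_s)
```

For instance, under the assumption of a 25 MHz operating frequency, `reload_for_period(25_000_000, 0.001)` gives the reload value for a 1 ms real-time clock tick.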

Trace Facility 

The Trace Facility allows a debug program to emulate 
single-instruction stepping in a program under test. This
facility allows a trap to be generated after the execution 
of any instruction in the program being tested. 

Using the Trace Facility, the debug program can inspect 
and modify the state of the program at every instruction 
boundary. The Trace Facility is designed to work 
properly in the presence of normal system interrupts
and traps. 

FUNCTIONAL OPERATION

This section briefly describes the operation of Am29000 
hardware. It introduces the processor pipeline and the 
three major internal functional units: the Instruction 
Fetch Unit, the Execution Unit, and the Memory Man- 
agement Unit. Finally, the processor's operational 
modes are described. 

Four-Stage Pipeline 

The Am29000 implements a four-stage pipeline for in- 
struction execution. The four stages are: fetch, decode, 
execute, and write-back. The pipeline is organized so 
that the effective instruction execution rate is as high as 
one instruction per cycle. Data forwarding and pipeline
interlocks are handled by processor hardware. 
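
To show why a four-stage pipeline approaches one instruction per cycle, the following Python sketch builds the stage occupancy for an ideal run with no stalls, branches, or external-access delays. That idealization is an assumption of the sketch; the function names are illustrative.

```python
STAGES = ["fetch", "decode", "execute", "write-back"]

def pipeline_schedule(n_instructions):
    """Return {cycle: {stage: instruction index}} for a stall-free run."""
    schedule = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i enters stage s in cycle i + s.
            schedule.setdefault(i + s, {})[stage] = i
    return schedule

def total_cycles(n_instructions):
    """Cycles to complete n instructions: pipeline fill plus one per instruction."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1
```

The fixed cost of filling the pipeline is three cycles, so for long instruction sequences the average approaches one cycle per instruction.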

Fetch Stage 

During the fetch stage, the Instruction Fetch Unit 
determines the location of the next processor instruction 
and issues the instruction to the decode stage. The in- 
struction is fetched either from the Instruction Prefetch 
Buffer, the Branch Target Cache, or an external 
instruction memory. 

Decode Stage 

During the decode stage, the Execution Unit decodes 
the instruction selected during the fetch stage and 
fetches and/or assembles the required operands. It also 
evaluates addresses for branches, loads, and stores. 

Execute Stage 

During the execute stage, the Execution Unit performs 
the operation specified by the instruction. In the case of
branches, loads, and stores, the Memory Management 
Unit performs address translation if required. 

Write-Back Stage 

During the write-back stage, the results of the operation 
performed during the execute stage are stored. In the 
case of branches, loads, and stores, the physical ad- 
dress resulting from translation during the execute 
stage is transmitted to an external device or memory. 

Function Organization 

Figure 3 shows the Am29000 internal data-flow organi- 
zation. The following sections refer to the various com- 
ponents on this data-flow diagram. 

Instruction Fetch Unit 

The Instruction Fetch Unit fetches instructions and supplies
instructions to other functional units. It incorporates
the Instruction Prefetch Buffer, the Branch Target
Cache, and the Program Counter Unit. All components 
of the Instruction Fetch Unit operate during the fetch 
stage of the processor pipeline. 

Instruction Prefetch Buffer 

Most instructions executed by the Am29000 are fetched
from external instruction/data memory. The processor
prefetches instructions so that they are requested at
least four cycles before they are required for execution.

Prefetched instructions are stored in a four-word In- 
struction Prefetch Buffer while awaiting execution. An 
instruction prefetch request occurs whenever there is a 
free location in this buffer (if the processor is otherwise
enabled to fetch instructions). When a nonsequential in- 
struction fetch occurs, prefetching is terminated, and 
then restarted for the new instruction stream. 

Instruction prefetching uncouples the instruction fetch 
rate from the instruction access latency. For example,
an instruction may be transferred to the processor two 
cycles after it is requested. However, as long as instruc- 
tions are supplied to the processor at an average rate of 
one instruction per cycle, this latency has no effect on 
the instruction execution rate. 

Branch Target Cache 

The Am29000 incorporates a Branch Target Cache that 
contains as many as 128 instructions. The Branch Target
Cache is a two-way, set-associative cache containing
the first four target instructions of a number of recently
taken branches. Each of the two sets in the
Branch Target Cache contains 64 instructions, and the
64 instructions are further divided into 16 blocks of 4
instructions each.
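
The cache geometry above can be checked with a short Python sketch. The constants come from the text; the `block_index` mapping from a branch-target byte address to one of the 16 blocks is an assumption made for illustration, since the address bits used for block selection are not stated here.

```python
SETS = 2                      # two-way set-associative
BLOCKS_PER_SET = 16
INSTRUCTIONS_PER_BLOCK = 4
INSTRUCTION_BYTES = 4         # all instructions are 32 bits long

def capacity():
    """Total number of instructions the Branch Target Cache can hold."""
    return SETS * BLOCKS_PER_SET * INSTRUCTIONS_PER_BLOCK

def block_index(target_addr):
    """Assumed mapping: drop the bits addressing bytes within a block,
    then use the next four bits to select one of the 16 blocks."""
    block_bytes = INSTRUCTION_BYTES * INSTRUCTIONS_PER_BLOCK
    return (target_addr // block_bytes) % BLOCKS_PER_SET
```

Under this assumed mapping, two branch targets whose addresses differ by a multiple of 256 bytes compete for the same block position, and the two sets allow both to be cached at once.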

The purpose of the Branch Target Cache is to provide 
instructions for the beginning of a nonsequential
instruction-fetch sequence. This keeps the instruction
pipeline full until the processor can establish a new
instruction prefetch stream from the external instruction/
data memory. 

The processor is organized so that branch instructions
can execute in a single cycle if the target instruction sequence
is present in the Branch Target Cache.

Program Counter Unit 

The Program Counter Unit creates and sequences 
addresses of instructions as they are executed by the
processor. 

Execution Unit 

The Execution Unit executes instructions. It incorpo- 
rates the Register File, the Address Unit, the Arithmetic/ 
Logic Unit, the Field Shift Unit, and the Prioritizer. The 
Register File and Address Unit operate during the de- 
code stage of the pipeline. The Arithmetic/Logic Unit, 
Field Shift Unit, and Prioritizer operate during the exe- 
cute stage of the pipeline. The Register File operates 
during the write-back stage. 

Register File 

The general-purpose registers are implemented by a 
192-location Register File. The Register File can per- 
form two read accesses and one write access in a single 
cycle. Normally, two read accesses are performed during
the decode-pipeline stage to fetch operands required
by the instruction being decoded. The write access
during the same cycle completes the write-back
stage of a previously executed instruction. 

Addressing logic associated with the Register File dis- 
tinguishes between the global and local general- 
purpose registers, and it performs the Stack-Pointer ad- 
dressing for the local registers. Register File addressing 
functions are performed during the decode stage. 

Address Unit 

The Address Unit evaluates addresses for branches, 
loads, and stores. It also assembles instruction-immedi- 
ate data and computes addresses for Load Multiple and 
Store Multiple sequences. 

Arithmetic/Logic Unit 

The ALU performs all logical, compare, and arithmetic 
operations (including multiply step and divide step). 

Field Shift Unit 

The Field Shift Unit performs N-bit shifts. The Field Shift 
Unit also performs byte and half-word extract and insert 
operations, and it extracts words from double words. 



Prioritizer 

The Prioritizer provides a count of the number of leading
zero bits in a 32-bit word; this is useful for performing
floating-point normalization, for example. It can also
be used to implement prioritization in a multilevel 
interrupt handler. 
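
The Prioritizer's function corresponds to the Count Leading Zeros (CLZ) instruction listed in Figure 1. A Python sketch of that operation and of its use for interrupt prioritization follows; the convention that bit 31 is the highest-priority source is an assumption of the example.

```python
def count_leading_zeros(word):
    """Number of zero bits above the highest set bit of a 32-bit word."""
    word &= 0xFFFFFFFF
    return 32 - word.bit_length()

def highest_priority(pending):
    """Index of the highest-priority pending source (0 = bit 31),
    or None if no source is pending."""
    n = count_leading_zeros(pending)
    return None if n == 32 else n
```

For normalization, the same count gives the left-shift distance needed to bring the leading 1 of a mantissa to the most-significant bit position.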

Memory Management Unit 

The Memory Management Unit (MMU) performs ad- 
dress translation and memory-protection functions for 
all branches, loads, and stores. The MMU operates dur- 
ing the execute stage of the pipeline, so the physical ad- 
dress that it generates is available at the beginning of 
the write-back stage. 

All addresses for external accesses are physical ad- 
dresses. MMU operation is pipelined with external ac- 
cesses, so that an address translation can occur while a 
previous access is being completed. 

Address translation is not performed for the addresses 
associated with instruction prefetching. Instead, these 
addresses are generated by an instruction prefetch
pointer that is incremented by the processor. Address 
translation is performed only at the beginning of the
prefetch sequence (as the result of a branch instruction),
and when the prefetch pointer crosses a potential
virtual-page boundary. 

Processor Modes 

The Am29000 operates in several different modes to 
accomplish various processor and system functions. All
modes except for Pipeline Hold (see below) are under
direct control of instructions and/or processor control 
inputs. The Pipeline Hold mode normally is determined 
by the relative timing between the processor and its 
external system for certain types of operations. The 
processor provides an external indication of its 
operational mode.

Executing 

When the processor is in the Executing mode, it fetches 
and executes instructions as described in this manual. 
External accesses occur as required. 

Wait

When the processor is in the Wait mode, it does not exe- 
cute instructions and it performs no external accesses. 
The Wait mode is controlled by the Current Processor 
Status Register. The processor leaves this mode when 
an interrupt or trap for which it is enabled occurs, or 
when a reset occurs. 

Pipeline Hold 

Under certain conditions, processor pipelining might 
cause nonsequential instruction execution or timing-de- 
pendent results of execution. For example, the proces- 
sor might attempt to execute an instruction that has not 
been fetched from instruction/data memory. 

For such cases, pipeline-interlock hardware detects the
anomalous condition and suspends processor execu- 
tion until execution can proceed properly. While execu- 
tion is suspended by the interlock hardware, the proces- 
sor is in the Pipeline Hold mode. The processor re- 
sumes execution when the pipeline-interlock hardware 
determines that it is correct to do so. 

Halt 

The Halt mode is provided so that the processor may be 
placed under the control of the ADAPT29K or other 
hardware-development system for the purposes of 
hardware and software debugging. The processor en- 
ters the Halt mode as the result of instruction execution, 
or as the result of external controls. In the Halt mode, the
processor neither fetches nor executes instructions. 

Step 

The Step mode allows the ADAPT29K or other hard- 
ware-development system to step through processor 
pipeline operation on a stage-by-stage basis. The Step 
mode is nearly identical to the Halt mode, except that it 
enables the processor to enter the Executing mode 
while the pipeline advances by one stage. 



Load Test Instruction 

The Load Test Instruction mode permits the ADAPT29K 
or other hardware-development system to access data 
contained in the processor or system. This is accom- 
plished by allowing the ADAPT29K to supply the pro- 
cessor with instructions, instead of having the processor 
fetch instructions from instruction/data memory. The 
Load Test Instruction mode is defined so that, once the
processor has completed the execution of instructions
provided by the ADAPT29K, it may resume the execution
of its normal instruction sequence.

Test 

The Test mode facilitates testing of hardware associ- 
ated with the processor by disabling processor outputs 
so that they may be driven directly by test hardware. The 
Test mode also allows the addition of a second proces- 
sor to a system to monitor the outputs of the first and to 
signal detected errors. 

Reset 

The Reset mode provides initialization of certain pro- 
cessor registers and control state. This is used for 
power-on reset, for eliminating unrecoverable error conditions,
and for supporting certain hardware debugging
functions. 

System Interface 

This section briefly describes the features of the 
Am29000 that allow it to be connected to other system 
components. 

The two major interfaces of the Am29000, introduced in 
this section, are the channel and the Test/Development 
interfaces. The other topics briefly described here are 
clock generation, master/slave checking, and coprocessor
attachment.

Channel 

The Am29000 channel consists of the following 32-bit 
buses and related controls: 

1. An Instruction Bus, which transfers instructions
into the processor 

2. A Data Bus, which transfers data to and from the 
processor 

3. An Address Bus, which provides addresses for 
both instruction and data accesses. The address
bus also is used to transfer data to a
coprocessor. 

The channel performs accesses and data transfers to all 
external devices and memories, including instruction/
data memories, instruction caches, instruction read-only
memories, data caches, input/output devices, bus
converters, and coprocessors. 

The channel defines three different access protocols:
simple, pipelined, and burst-mode. For simple 
accesses, the Am29000 holds the address valid 
throughout the entire access. This is appropriate for 
high-speed devices that can complete an access in one 
cycle, and for low-cost devices that are accessed in- 
frequently (such as read-only memories containing 
initialization routines). Pipelined and burst-mode 
accesses provide high performance with other types of 
devices and memories. 

For pipelined accesses, the address transfer is uncou- 
pled from the corresponding data or instruction transfer. 
After transmitting an address for a request, the proces- 
sor may transmit one more address before receiving the 
reply to the first request. This allows address transfer 
and decoding to be overlapped with another access. 

On the other hand, burst-mode accesses eliminate the 
address-transfer cycle completely. Burst-mode ac- 
cesses are defined so that once an address is trans- 
ferred for a given access, subsequent accesses to se- 
quentially increasing addresses may occur without re- 
transfer of the address. The burst may be terminated at 
any time by either the processor or responding device. 
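
The saving in address transfers can be illustrated with a simplified Python count. The sketch assumes word accesses at byte addresses, assumes that a burst continues only while each address is exactly 4 bytes above the previous one, and ignores early termination of a burst by either party as well as pipelined accesses; the function name is hypothetical.

```python
def address_transfers(addresses, burst_mode):
    """Count address-transfer cycles for a sequence of word accesses."""
    if not burst_mode:
        return len(addresses)        # simple access: one address per access
    transfers = 0
    previous = None
    for addr in addresses:
        if previous is None or addr != previous + 4:
            transfers += 1           # a new burst must be started
        previous = addr
    return transfers
```

For a straight-line instruction stream with one branch, this model needs only two address transfers in burst mode, which matches the observation that instruction addresses are sent to the system only when a branch occurs.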

The Am29000 determines whether an access is simple, 
pipelined, or burst-mode on a transfer-by-transfer (i.e., 
generally device-by-device) basis. However, an access 
that begins as a simple access may be converted to a 
pipelined or burst-mode access at any time during the 
transfer. This relaxes the timing constraints on the chan- 
nel-protocol implementation, since addressed devices 
do not have to respond immediately to a pipelined or 
burst-mode request. 

Except for the shared address bus, the channel main- 
tains a strict division between instruction and data 
accesses. In the most common situation, the system 
supplies the processor with instructions using burst- 
mode accesses, with instruction addresses transmitted 
to the system only when a branch occurs. Data ac- 
cesses can occur simultaneously without interfering 
with instruction transfer. 

The Am29000 contains arbitration logic to support other 
masters on the channel. A single external master can ar- 
bitrate directly for the channel, while multiple masters 
may arbitrate using a daisy chain or other method that 
requires no additional arbitration logic. However, to in- 
crease arbitration performance in a multiple-master 
configuration, an external channel arbiter should be 
used. This arbiter works in conjunction with the proces- 
sor's arbitration logic. 

Test/Development Interface 

The Am29000 supports the attachment of the 
ADAPT29K or other hardware-development system. 
This attachment is made directly to the processor in the 
system under development, without the removal of the 
processor from the system. The Test/Development In- 
terface makes it possible for the hardware-development 
system to gain control over the Am29000, and inspect 
and modify its internal state (e.g., general-purpose register
contents, TLB entries, etc.). In addition, the
Am29000 may be used to access other system devices 
and memories on behalf of the hardware-development 
system. 

The Test/Development Interface is made up of controls 
and status signals provided on the Am29000, as well as 
the instruction and data buses. The Halt, Step, Reset, 
and Load Test Instruction modes allow the hardware- 
development system to control the operation of the 
Am29000. The hardware-development system may 
supply the processor with instructions on the instruction
bus using the load test instruction mode. The internal 
processor state can be inspected and modified via the 
data bus. 

Clocks 

The Am29000 generates and distributes a system clock 
at its operating frequency. This clock is specially de- 
signed to reduce skews between the system clock and 
the processor's internal clocks. The internal clock-gen- 
eration circuitry requires a single-phase oscillator signal 
at twice the processor operating frequency. 

For systems in which processor-generated clocks are 
not appropriate, the Am29000 also can accept a clock 
from an external clock generator. 

The processor decides between these two clocking 
arrangements based on whether the power supply to 
the clock-output driver (PWRCLK) is tied to +5 volts or to
Ground. 

Master/Slave Operation 

Each Am29000 output has associated logic that com- 
pares the signal on the output with the signal that the 
processor is providing internally to the output driver. The 
processor signals situations where the output of any en- 
abled driver does not agree with its input. 

For a single processor, the output comparison detects 
short circuits in output signals, but does not detect open 
circuits. It is possible to connect a second processor in 
parallel with the first, where the second processor has 
its outputs disabled due to the Test mode. The second 
processor detects open-circuit signals, as well as pro- 
vides a check of the outputs of the first processor. 

Coprocessor Attachment 

A coprocessor for the Am29000 attaches directly to the 
processor channel. However, this attachment has fea- 
tures that are different from those of other channel de- 
vices. The coprocessor interface is designed to support 
a high operand transfer rate and to support the overlap 
of coprocessor operations with other processor opera- 
tions, including other external accesses. 

The coprocessor is assigned a special address space 
on the channel. This permits the transfer of operands 
and other information on the address bus without inter- 
fering with normal addressing functions. Since both the 
address bus and data bus are used for data transfer, the
Am29000 can transfer 64 bits of information to the
coprocessor in one cycle.

Program Modes 

All system-protection features of the Am29000 are
based on two mutually exclusive program modes: the
Supervisor mode and the User mode. Memory pro- 
tection in the Memory Management Unit is also based 
on the Supervisor and User modes (see Memory 
Management section). 

Supervisor Mode 

The processor is in the Supervisor mode whenever the 
Supervisor Mode (SM) bit of the Current Processor
Status Register (see Register Description section) is 1.
In this mode, executing programs have access to all 
processor resources. 

During the address cycle of a channel request, the 
Supervisor mode is indicated by the SUP/US output be- 
ing High. 

User Mode 

The processor is in the User mode whenever the SM bit
in the Current Processor Status Register is 0. In this 
mode, any of the following actions by an executing pro- 
gram causes a Protection Violation trap to occur: 

1. An attempted access of any TLB entry.

2. An attempted access of any general-purpose
register for which a bit in the Register Bank Protect
Register is 1.



3. An attempted execution of a load or store instruction
for which the PA bit is 1, or for which the
UA bit is 1. (The attempted execution of a translated
load or store for which the AS bit is 1 also
causes a Protection Violation trap. However, 
this trap occurs regardless of whether or not the 
processor is in the User mode.) 

4. An attempted execution of one of the following 
instructions: Interrupt Return, Interrupt Return 
and Invalidate, Invalidate, or Halt. However, a 
hardware-development system such as the 
ADAPT29K can disable protection checking for 
the Halt instruction, so this instruction may be
used to implement instruction breakpoints in 
User-mode programs. 

5. An attempted access of one of the following reg- 
isters: Vector Area Base Address, Old Proces- 
sor Status, Current Processor Status, Configu- 
ration, Channel Address, Channel Data, Chan- 
nel Control, Register Bank Protect, Timer 
Counter, Timer Reload, Program Counter 0, 
Program Counter 1, Program Counter 2, MMU 
Configuration, or LRU Recommendation. 

6. An attempted execution of an Assert or Emulate
instruction that specifies a vector number between
0 and 63, inclusive.

Devices and memories on the channel also can implement
protection and generate traps based on the value
of the SM bit. During the address cycle of a channel re- 
quest, the User mode is indicated by the SUP/US output 
being Low. 

REGISTER DESCRIPTION

The Am29000 has three classes of registers that are 
accessible by instructions. These are general-purpose 
registers, special-purpose registers, and Translation 
Look-Aside Buffer (TLB) registers. Any operation avail- 
able in the Am29000 can be performed on the general- 
purpose registers, while special-purpose registers and 
TLB registers are accessed only by explicit data move- 
ment to or from general-purpose registers. Various pro- 
tection mechanisms prevent the access of some of 
these registers by User-mode programs. 

General-Purpose Registers 

The Am29000 incorporates 192 general-purpose regis- 
ters. The organization of the general-purpose registers 
is diagrammed in Figure 4. 

General-purpose registers hold the following types of 
operands for program use: 

1. 32-bit data addresses 

2. 32-bit signed or unsigned integers 

3. 32-bit branch-target addresses 

4. 32-bit logical bit strings 

5. 8-bit signed or unsigned characters 

6. 16-bit signed or unsigned integers

7. word-length Booleans 

8. single-precision floating-point numbers 

9. double-precision floating-point numbers (in two 
register locations) 

Because a large number of general-purpose registers 
are provided, a large amount of frequently used data 
can be kept on-chip, where access time is fastest. 

Am29000 instructions can specify two general-purpose 
registers for source operands, and one general-purpose
register for storing the instruction result. These registers
are specified by three 8-bit instruction fields containing
register numbers. A register may be specified directly by 
the instruction, or indirectly by one of three special-pur- 
pose registers. 

Register Addressing 

The general-purpose registers are partitioned into 64 
global registers and 128 local registers, differentiated by 
the most-significant bit of the register number. The dis- 
tinction between global and local registers is the result of 
register-addressing considerations. 

The following terminology is used to describe the ad- 
dressing of general-purpose registers: 

1. Register number— this is a software-level number
for a general-purpose register. For example,
this is the number contained in an instruction
field. Register numbers range from 0 to 255.

2. Global register number— this is a software-level
number for a global register. Global register
numbers range from 0 to 127.

3. Local register number— this is a software-level
number for a local register. Local register numbers
range from 0 to 127.

4. Absolute register number— this is a hardware- 
level number used to select a general-purpose 
register in the Register File. Absolute register 
numbers range from 0 to 255.

Global Registers 

When the most-significant bit of a register number is 0, a
global register is selected. The seven least-significant 
bits of the register number give the global register num- 
ber. For global registers, the absolute register number is 
equivalent to the register number. 

Global Registers 2 through 63 are unimplemented. An 
attempt to access these registers yields unpredictable 
results; however, they may be protected from User- 
mode access by the Register Bank Protect Register. 

The register numbers associated with Global Registers
0 and 1 have special meaning. The number for Global
Register 0 specifies that an indirect pointer is to be used
as the source of the register number; there is an indirect 
pointer for each of the instruction operand/result 
registers. Global Register 1 contains the Stack Pointer, 
which is used in the addressing of local registers as 
explained below. 

Local Register Stack Pointer 

The Stack Pointer is a 32-bit register that may be an op- 
erand of an instruction as any other general-purpose 
register. However, a shadow copy of Global Register 1 
is maintained by processor hardware to be used in local 
register addressing. This shadow copy is set only with 
the results of Arithmetic and Logical instructions. If the 
Stack Pointer is set with the result of any other instruc- 
tion class, local registers cannot be accessed predict- 
ably until the Stack Pointer is set once again with an 
Arithmetic or Logical instruction. 

Local Registers 

When the most-significant bit of a register number is 1, a
local register is selected. The seven least-significant
bits of the register number give the local register number.
For local registers, the absolute register number is
obtained by adding the local register number to bits 8-2
of the Stack Pointer and truncating the result to seven
bits; the most-significant bit of the original register number
is unchanged (i.e., it remains a 1).

The Stack Pointer addition applied to local register num- 
bers provides a limited form of base-plus-offset ad- 
dressing within the local registers. The Stack Pointer 
contains the 32-bit base address. This assists run-time 
storage management of variables for dynamically 
nested procedures. 
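
The register-number mapping described above can be sketched in C. This is only an illustration of the addressing rule, not processor code; the function name am29k_abs_regnum is hypothetical, and the sketch does not model the indirect-pointer case selected by Global Register 0.

```c
#include <stdint.h>

/* Hypothetical helper: map an 8-bit register number to an absolute
 * register number, given the value of the Stack Pointer
 * (Global Register 1). Indirect access via Global Register 0 is
 * not modeled here. */
static uint8_t am29k_abs_regnum(uint8_t regnum, uint32_t stack_pointer)
{
    if ((regnum & 0x80u) == 0) {
        /* Global register: the absolute register number is
         * equivalent to the register number. */
        return regnum;
    }

    /* Local register: add the 7-bit local register number to
     * bits 8-2 of the Stack Pointer, then truncate to seven bits. */
    uint32_t local = regnum & 0x7Fu;
    uint32_t base  = (stack_pointer >> 2) & 0x7Fu;
    uint32_t sum   = (local + base) & 0x7Fu;

    /* The most-significant bit remains a 1. */
    return (uint8_t)(0x80u | sum);
}
```

For example, with a Stack Pointer of 0x10 (bits 8-2 equal to 4), local register 5 (register number 0x85) maps to absolute register 0x89. The truncation to seven bits makes the local registers wrap around within absolute numbers 128 through 255.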

Figure 4. General-Purpose Register Organization



Register Banking 

For the purpose of access restriction, the general-
purpose registers are divided into register banks. Register
banks consist of 16 registers (except for Bank 0,
which contains Unimplemented Registers 2 through 15)
and are partitioned according to absolute register numbers,
as shown in Figure 5.

The Register Bank Protect Register contains 16 protection
bits, where each bit controls User-mode accesses
(read or write) to a bank of registers. Bits 0-15 of the
Register Bank Protect Register protect Register Banks
0 through 15, respectively.

Register Bank
Protect           Absolute-           General-Purpose
Register Bit      Register Numbers    Registers

0                 2 through 15        Bank 0 (unimplemented)
1                 16 through 31       Bank 1 (unimplemented)
2                 32 through 47       Bank 2 (unimplemented)
3                 48 through 63       Bank 3 (unimplemented)
4                 64 through 79       Bank 4
5                 80 through 95       Bank 5
6                 96 through 111      Bank 6
7                 112 through 127     Bank 7
8                 128 through 143     Bank 8
9                 144 through 159     Bank 9
10                160 through 175     Bank 10
11                176 through 191     Bank 11
12                192 through 207     Bank 12
13                208 through 223     Bank 13
14                224 through 239     Bank 14
15                240 through 255     Bank 15

Figure 5. Register Bank Organization

When a bit in the Register Bank Protect Register is 1 and
a register in the corresponding bank is specified as an 
operand register or result register by a User-mode in- 
struction, a Protection Violation trap occurs. Note that 
protection is based on absolute register numbers; in the 
case of local registers, Stack-Pointer addition is per- 
formed before protection checking. 



When the processor is in Supervisor mode, the Register 
Bank Protect Register has no effect on general-purpose 
register accesses. 
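
As a rough illustration of the bank-protection rule, the check can be written as a small C predicate. The name am29k_bank_violation and its parameters are hypothetical; the sketch assumes the caller has already converted the register number to an absolute register number (that is, Stack-Pointer addition has been applied for local registers).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: returns true when an access to the given
 * absolute register would cause a Protection Violation trap.
 * rbp holds the 16 protection bits of the Register Bank Protect
 * Register (bit n protects Bank n). */
static bool am29k_bank_violation(uint8_t abs_regnum, uint16_t rbp,
                                 bool supervisor_mode)
{
    if (supervisor_mode) {
        /* The Register Bank Protect Register has no effect in
         * Supervisor mode. */
        return false;
    }

    /* Banks hold 16 registers each, so the bank number is the
     * upper four bits of the absolute register number. */
    unsigned bank = abs_regnum >> 4;
    return ((rbp >> bank) & 1u) != 0;
}
```

For instance, with only bit 8 of the protection bits set, a User-mode access to absolute register 130 (Bank 8) is flagged, while the same access in Supervisor mode is not.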

Indirect Accesses 

Specification of Global Register 0 as an instruction-operand
register or result register causes an indirect access
to the general-purpose registers. In this case, the
absolute register number is provided by an indirect 
pointer contained in a special-purpose register. 

Each of the three possible registers for instruction execution
has an associated 8-bit indirect pointer. Indirect
register numbers can be selected independently for 
each of the three operands. Since the indirect pointers 
contain absolute register numbers, the number in an 
indirect pointer is not added to the Stack Pointer when 
local registers are selected. 

The indirect pointers are set by the Move To Special
Register, Floating-Point, MULTIPLY, MULTM, MULTIPLU,
MULTMU, DIVIDE, DIVIDU, SETIP, and EMULATE
instructions.

For a Move To Special Register instruction, an indirect 
pointer is set with bits 9-2 of the 32-bit source operand. 
This provides consistency between the addressing of 
words in general-purpose registers and the addressing 
of words in external devices or memories. A modifica- 
tion of an indirect pointer using a Move To Special Reg- 
ister has a delayed effect on the addressing of general- 
purpose registers. 
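
The bit selection used by Move To Special Register can be sketched as follows. The function name is hypothetical, and the sketch shows only the field extraction, not the delayed effect on register addressing.

```c
#include <stdint.h>

/* Hypothetical helper: the 8-bit indirect pointer value that results
 * when a Move To Special Register writes the given 32-bit source
 * operand. Bits 9-2 are used, so the operand behaves like a byte
 * address of a word: absolute register number times four. */
static uint8_t am29k_indirect_pointer_from_operand(uint32_t operand)
{
    return (uint8_t)((operand >> 2) & 0xFFu);
}
```

Under this reading, an operand of 0x204 selects absolute register 129, in the same way that a byte address of 0x204 would select word 129 in an external memory.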

For the remaining instructions, all three indirect pointers
are set, simultaneously, with the absolute register num- 
bers derived from the register numbers specified by the 
instruction. For any local registers selected by the in- 
struction, the Stack-Pointer addition is applied to the 
register numbers before the indirect pointers are set. 

Register numbers stored into the indirect pointers are 
checked for bank-protection violations — except when 
an indirect pointer is set by a Move-To-Special-Register 
instruction — at the time that the indirect pointers are set. 

Special-Purpose Registers 

The Am29000 contains 27 special-purpose registers. 
The organization of the special-purpose registers is 
shown in Figure 6. 

Special-purpose registers provide controls and data for 
certain processor operations. Some special-purpose 
registers are updated dynamically by the processor, in- 
dependent of software controls. Because of this, a read 
of a special-purpose register following a write does not 
necessarily get the data that was written. 

Some special-purpose registers have fields that are re- 
served for future processor implementations. When a 
special-purpose register is read, a bit in a reserved field 
is read as a 0. An attempt to write a reserved bit with a 1 
has no effect; however, this should be avoided because 
of upward-compatibility considerations. 

The special-purpose registers are accessed by explicit 
data movement only. Instructions that move data to or 
from a special-purpose register specify the special- 
purpose register by an 8-bit field containing a special- 
purpose register number. Register numbers are speci- 
fied directly by instructions. 

An attempted read of an unimplemented special-purpose
register yields an unpredictable value. An attempted
write of an unimplemented special-purpose
register has no effect; however, this should be avoided,
because of upward-compatibility considerations.

The special-purpose registers are partitioned into pro- 
tected and unprotected registers. Special-purpose reg- 
isters numbered 0-127 and 160-255 are protected 
(note that not all of these are implemented). Special- 
purpose registers numbered 128-159 are unprotected 
(again, not all are implemented).

Protected special-purpose registers numbered 0-127 
are accessible only by programs executing in the Super- 
visor mode. An attempted read or write of a protected 
special-purpose register by a User-mode program 
causes a Protection Violation trap to occur. Protected 
special-purpose registers numbered 160-255 are not 
accessible by programs in either the User mode or the 
Supervisor mode. These register numbers identify vir- 
tual registers in the floating-point architecture. 

The Floating-Point Environment Register, Integer Envi- 
ronment Register, Floating-Point Status Register, and 
Exception Opcode Register are not implemented in 
processor hardware. These registers are implemented 
via a virtual floating-point interface provided on the 
Am29000. 

Unprotected special-purpose registers are accessible 
by programs executing in both the User and Supervisor 
modes. 
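
The partitioning of special-purpose register numbers can be summarized in a short C sketch. The enum and function names are hypothetical, and the sketch classifies register numbers only; it does not indicate which numbers are actually implemented.

```c
#include <stdint.h>

/* Hypothetical classification of special-purpose register numbers,
 * following the ranges given in the text. */
typedef enum {
    AM29K_SPR_SUPERVISOR_ONLY, /* 0-127: Supervisor mode only        */
    AM29K_SPR_UNPROTECTED,     /* 128-159: User and Supervisor modes */
    AM29K_SPR_VIRTUAL          /* 160-255: not accessible in either
                                  mode; virtual floating-point regs  */
} am29k_spr_class;

static am29k_spr_class am29k_classify_spr(uint8_t spr_number)
{
    if (spr_number <= 127) {
        return AM29K_SPR_SUPERVISOR_ONLY;
    }
    if (spr_number <= 159) {
        return AM29K_SPR_UNPROTECTED;
    }
    return AM29K_SPR_VIRTUAL;
}
```

A User-mode access to a register in the first class would cause a Protection Violation trap, as described above.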

Vector Area Base Address (Register 0) 

This protected special-purpose register (see Figure 7) 
specifies the beginning address of the interrupt/trap 
Vector Area. The Vector Area is either a table of 256 
vectors that points to interrupt and trap handling 
routines, or a segment of 256 64-instruction blocks that 
directly contains the interrupt and trap handling 
routines. 

The organization of the Vector Area is determined by the 
Vector Fetch (VF) bit of the Configuration Register. If the
VF bit is 1 when an interrupt or trap is taken, the vector
number for the interrupt or trap (see Interrupts and
Traps section) replaces bits 9-2 of the value in the 
Vector Area Base Address Register to generate the 
physical address for a vector contained in instruction/ 
data memory. 

If the VF bit is 0, the vector number replaces bits 15-8 of
the value in the Vector Area Base Address Register to 
generate the physical address of the first instruction of 
the interrupt or trap handler. The instruction fetch for this
instruction is directed either to instruction memory or in- 
struction read-only memory, as determined by the ROM 
Vector Area (RV) bit of the Configuration Register. 

Bits 31-16: Vector Area Base (VAB)— The VAB field 
gives the beginning address of the Vector Area. This address
is constrained to begin on a 64-Kbyte address
boundary in instruction/data memory or instruction read-only
memory.

Figure 6. Special-Purpose Registers

Figure 7. Vector Area Base Address Register



Bits 15-0: Zeros— These bits force the alignment of the
Vector Area. 
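
The two Vector Area organizations lead to two address computations, sketched below in C. The function name is hypothetical. For the VF = 0 case the sketch assumes 4-byte instructions, so that each 64-instruction block occupies 256 bytes and the 8-bit vector number lands in bits 15-8 of the address; this also matches the 64-Kbyte size implied by the alignment of the VAB field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: physical address generated for a given vector
 * number. vab_reg is the Vector Area Base Address Register value and
 * vf is the Vector Fetch bit of the Configuration Register. */
static uint32_t am29k_vector_address(uint32_t vab_reg, uint8_t vector,
                                     bool vf)
{
    /* Bits 15-0 of the register are zeros; mask them defensively. */
    uint32_t base = vab_reg & UINT32_C(0xFFFF0000);

    if (vf) {
        /* Table of 256 one-word vectors: the vector number
         * replaces bits 9-2. */
        return base | ((uint32_t)vector << 2);
    }

    /* 256 blocks of 64 instructions (assumed 256 bytes each):
     * the vector number selects the block. */
    return base | ((uint32_t)vector << 8);
}
```

With a base of 0x00010000 and vector number 5, this gives 0x00010014 for the vector-table organization and 0x00010500 for the handler-block organization.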

Old Processor Status (Register 1) 

This protected special-purpose register has the same 
format as the Current Processor Status described be- 
low. The Old Processor Status stores a copy of the Cur- 
rent Processor Status when an interrupt or trap is taken. 
This is required since the Current Processor Status will 
be modified to reflect the status of the interrupt/trap
handler. 

During an interrupt return, the Old Processor Status is 
copied into the Current Processor Status. This allows 
the Current Processor Status to be set as required for 
the routine that is the target of the interrupt return. 

Current Processor Status (Register 2) 

This protected special-purpose register (see Figure 8) 
controls the behavior of the processor and its ability to 
recognize exceptional events. 

Bits 31-16: reserved. 

Bit 15: Coprocessor Active (CA)— The CA bit is set
and reset under the control of load and store instructions
that transfer information to and from a coprocessor. This
bit indicates that the coprocessor is performing an operation
at the time that an interrupt or trap is taken. This
notifies the interrupt or trap handler that the coprocessor
contains state information to be preserved. Note that
this notification occurs because the CA bit of the Old
Processor Status is 1 in this case, not because of the
value of the CA bit of the Current Processor Status.

Bit 14: Interrupt Pending (IP)— This bit allows software
to detect the presence of external interrupts while
they are disabled. The IP bit is set if one or more of the
external signals INTR3-INTR0 is active, but the processor
is disabled from taking the resulting interrupt due to
the value of the DA, DI, or IM bits. If all external interrupt
signals subsequently are deasserted while still disabled,
the IP bit is reset.

Bits 13-12: Trace Enable, Trace Pending (TE, TP)— 
The TE and TP bits implement a software-controlled, in- 
struction single-step facility. Single stepping is not im- 
plemented directly, but rather emulated by trap se- 
quences controlled by these bits. The value of the TE bit 
is copied to the TP bit whenever an instruction execution
is completed. When the TP bit is 1, a Trace trap occurs.

Bit 11: Trap Unaligned Access (TU)— The TU bit en- 
ables checking of address alignment for external data- 
memory accesses. When this bit is 1, an Unaligned Access
trap occurs if the processor either generates an address
for an external word that is not aligned on a word
address boundary (i.e., either of the least-significant two 
bits is 1), or generates an address for an external half- 
word that is not aligned on a half-word address bound- 
ary (i.e., the least-significant address bit is 1). When the 
TU bit is 0, data-memory address alignment is ignored. 

Alignment is ignored for input/output accesses and
coprocessor transfers. The alignment of instruction addresses
is also ignored (unaligned instruction addresses
can be generated only by indirect jumps). Interrupt/trap
vector addresses always are aligned properly.

Bit 10: Freeze (FZ) — The FZ bit prevents certain regis- 
ters from being updated during interrupt and trap pro- 
cessing, except by explicit data movement. The affected 
registers are: Channel Address, Channel Data, Channel 
Control, Program Counter 0, Program Counter 1, Program
Counter 2, and the ALU Status Register.

When the FZ bit is 1, these registers hold their values.
An affected register can be changed only by a Move To
Special Register instruction. When the FZ bit is 0, there
is no effect on these registers, and they are updated by
processor instruction execution as described in this
manual.

The FZ bit is set whenever an interrupt or trap is taken, 
holding critical state in the processor so that it is not 
modified unintentionally by the interrupt or trap handler. 

Bit 9: Lock (LK)— The LK bit controls the value of the
LOCK external signal. If the LK bit is 1, the LOCK signal
is active. If the LK bit is 0, the LOCK signal is controlled 
by the execution of the instructions Load and Set, Load 
and Lock, and Store and Lock. This bit is provided for 
the implementation of multiprocessor synchronization 
protocols. 

Bit 8: ROM Enable (RE)— The RE bit enables instruc- 
tion fetching from external instruction read-only memory 
(ROM). When this bit is 1, the IREQT signal directs all
instruction requests to ROM. Instructions that are 
fetched from ROM are subject to capture and reuse by 
the Branch Target Cache when it is enabled; the Branch 
Target Cache distinguishes between instructions from 
ROM and those from non-ROM storage. When this bit 
is 0, off-chip requests for instructions are directed to 
instruction/data memory.

Bit 7: WAIT Mode (WM)— The WM bit places the processor
in the Wait mode. When this bit is 1, the processor
performs no operations. The Wait mode is reset by
an interrupt or trap for which the processor is enabled, or
by the Reset mode. 

Bit 6: Physical Addressing/Data (PD)— The PD bit 
determines whether address translation is performed 
for load or store operations. Address translation is per- 
formed for an access only when this bit is 0, and the 
Physical Address (PA) bit in the load or store instruction 
causing the access is also 0. 

Bit 5: Physical Addressing/Instructions (PI)— The PI
bit determines whether address translation is performed 
for external instruction accesses. Address translation is 
performed only when this bit is 0. 

Bit 4: Supervisor Mode (SM)— The SM bit protects 
certain processor context, such as protected special- 
purpose registers. When this bit is 1, the processor is in
the Supervisor mode, and access to all processor con- 
text is allowed. When this bit is 0, the processor is in the 
User mode, and access to protected processor context 
is not allowed; an attempt to access (either read or write) 
protected processor context causes a Protection Viola- 
tion trap. 

For an external access, the User Access (UA) bit in the
load or store instruction also controls access to pro- 
tected processor context. When the UA bit is 1, the 
Memory Management Unit and channel perform the ac- 
cess as though the program causing the access was in 
User mode. 

Bits 3-2: Interrupt Mask (IM)— The IM field is an encoding
of the processor priority with respect to external
interrupts. The interpretation of the interrupt mask is
specified by the following table:

IM Value    Result

00          INTR0 enabled
01          INTR1-INTR0 enabled
10          INTR2-INTR0 enabled
11          INTR3-INTR0 enabled



Bit 1: Disable Interrupts (DI)— The DI bit prevents the
processor from being interrupted by external interrupt
requests INTR3-INTR0. When this bit is 1, the processor
ignores all external interrupts. However, note that traps
(both internal and external), Timer interrupts, and Trace
traps will be taken. When this bit is 0, the processor will 
take any interrupt enabled by the IM field, unless the DA 
bit is 1. 

Bit 0: Disable All Interrupts and Traps (DA)— The DA
bit prevents the processor from taking any interrupts
and most traps. When this bit is 1, the processor ignores
interrupts and traps, except for the WARN, Instruction
Access Exception, Data Access Exception, and Coprocessor
Exception traps. When this bit is 0, all traps
will be taken, and interrupts will be taken if otherwise
enabled.
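
The Current Processor Status fields lend themselves to a set of C bit masks. The sketch below is illustrative only: the macro and function names are hypothetical, the TE and TP positions (bits 13 and 12) are inferred from the order in which the text lists them, and the two helpers cover just the external-interrupt enable rule (DA, DI, and IM) and the Unaligned Access check (TU) for external data-memory accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Current Processor Status bit masks (names are illustrative). */
#define CPS_CA       (1u << 15)  /* Coprocessor Active               */
#define CPS_IP       (1u << 14)  /* Interrupt Pending                */
#define CPS_TE       (1u << 13)  /* Trace Enable (assumed position)  */
#define CPS_TP       (1u << 12)  /* Trace Pending (assumed position) */
#define CPS_TU       (1u << 11)  /* Trap Unaligned Access            */
#define CPS_FZ       (1u << 10)  /* Freeze                           */
#define CPS_LK       (1u << 9)   /* Lock                             */
#define CPS_RE       (1u << 8)   /* ROM Enable                       */
#define CPS_WM       (1u << 7)   /* Wait Mode                        */
#define CPS_PD       (1u << 6)   /* Physical Addressing/Data         */
#define CPS_PI       (1u << 5)   /* Physical Addressing/Instructions */
#define CPS_SM       (1u << 4)   /* Supervisor Mode                  */
#define CPS_IM_SHIFT 2
#define CPS_IM_MASK  (3u << CPS_IM_SHIFT)  /* Interrupt Mask         */
#define CPS_DI       (1u << 1)   /* Disable Interrupts               */
#define CPS_DA       (1u << 0)   /* Disable All Interrupts and Traps */

/* True if external interrupt INTRn (n = 0..3) would be taken. */
static bool am29k_intr_enabled(uint32_t cps, unsigned n)
{
    if (n > 3 || (cps & (CPS_DA | CPS_DI)) != 0) {
        return false;
    }
    /* IM value k enables INTR0 through INTRk. */
    unsigned im = (cps & CPS_IM_MASK) >> CPS_IM_SHIFT;
    return n <= im;
}

/* True if an external data-memory access of the given size in bytes
 * would cause an Unaligned Access trap. */
static bool am29k_unaligned_trap(uint32_t cps, uint32_t addr,
                                 unsigned size_bytes)
{
    if ((cps & CPS_TU) == 0) {
        return false;              /* alignment is ignored */
    }
    if (size_bytes == 4) {
        return (addr & 3u) != 0;   /* word access          */
    }
    if (size_bytes == 2) {
        return (addr & 1u) != 0;   /* half-word access     */
    }
    return false;                  /* bytes never trap     */
}
```

For example, a status value with IM equal to 01 and both DI and DA clear enables INTR0 and INTR1 but not INTR2.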

Configuration (Register 3) 

This protected special-purpose register (see Figure 9) 
controls certain processor and system options. Most 
fields normally are modified only during system initialization.
The Configuration Register definition follows.

Bits 31-24: Processor Release Level (PRL)— The 
PRL field is an 8-bit, read-only identification number that
specifies the processor version. 

Bits 23-6: reserved.

Bit 5: Data Width Enable (DW)— The DW bit enables
and disables byte and half-word external accesses. If
the DW bit is 0, byte and half-word accesses are not performed
in hardware, and these accesses must be emulated
by software. If the DW bit is 1, byte and half-word
accesses are performed by hardware; this requires that
external devices and memories be able to write individual
bytes and half-words within a word.

Figure 9. Configuration Register

Bit 4: Vector Fetch (VF)— The VF bit determines the 
structure of the interrupt/trap Vector Area. If this bit is 1,
the Vector Area is defined as a block of 256 vectors that
specify the beginning addresses of the interrupt and trap
handling routines. If the VF bit is 0, the Vector Area is a
segment of 256 64-instruction blocks that contain the
actual routines. 

Bit 3: ROM Vector Area (RV)— If the VF bit is 0, the RV
bit specifies whether the Vector Area is contained in
instruction memory (RV = 0) or instruction read-only
memory (RV = 1). The value of the RV bit is irrelevant if
the VF bit is 1.

Bit 2: Byte Order (BO)— The BO bit determines the or- 
dering of bytes and half-words within words. If the BO bit 
is 0, bytes and half-words are numbered left-to-right 
within a word. If the BO bit is 1, bytes and half-words are
numbered right-to-left. 

Bit 1: Coprocessor Present (CP)— The CP bit indi- 
cates the presence of a coprocessor that may be used 
by the processor. If this bit is 1, it enables the execution
of load and store instructions that have a Coprocessor
Enable (CE) bit of 1. If the CP bit is 0 and the processor
attempts to execute a load or store instruction with a CE
bit of 1, a Coprocessor Not Present trap occurs. This
feature may be used to emulate coprocessor operations
as well as to protect the state of a coprocessor shared 
between multiple processes. 

Bit 0: Branch Target Cache Disable (CD)— The CD bit 
determines whether or not the Branch Target Cache is 
used for nonsequential instruction references. When 
this bit is 1, all instruction references are directed to external
instruction memory or instruction ROM, and the
Branch Target Cache is not used. When this bit is 0, the 
targets of nonsequential instruction fetches are stored in 
the Branch Target Cache and reused. The value of the
CD bit does not take effect until the execution of the next
branch instruction.
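
A minimal sketch of decoding the Configuration Register fields in C follows. The struct and function names are hypothetical; only the bit positions are taken from the field descriptions above, and the reserved bits 23-6 are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the Configuration Register. */
typedef struct {
    uint8_t prl; /* bits 31-24: Processor Release Level     */
    bool dw;     /* bit 5: Data Width Enable                */
    bool vf;     /* bit 4: Vector Fetch                     */
    bool rv;     /* bit 3: ROM Vector Area                  */
    bool bo;     /* bit 2: Byte Order                       */
    bool cp;     /* bit 1: Coprocessor Present              */
    bool cd;     /* bit 0: Branch Target Cache Disable      */
} am29k_cfg;

static am29k_cfg am29k_decode_cfg(uint32_t cfg)
{
    am29k_cfg out;
    out.prl = (uint8_t)(cfg >> 24);
    out.dw  = ((cfg >> 5) & 1u) != 0;
    out.vf  = ((cfg >> 4) & 1u) != 0;
    out.rv  = ((cfg >> 3) & 1u) != 0;
    out.bo  = ((cfg >> 2) & 1u) != 0;
    out.cp  = ((cfg >> 1) & 1u) != 0;
    out.cd  = (cfg & 1u) != 0;
    return out;
}
```

For example, the value 0x03000012 decodes to a release level of 3 with the VF and CP bits set and all other option bits clear.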

Channel Address (Register 4) 

This protected special-purpose register (Figure 10) is 
used to report exceptions during external accesses or 
coprocessor transfers. It also is used to restart interrupted
Load Multiple and Store Multiple operations, and
to restart other external accesses when possible (e.g., 
after TLB misses are serviced). 

The Channel Address Register is updated on the execution
of every load or store instruction, and on every load
or store in a Load Multiple or Store Multiple sequence,
except when the Freeze (FZ) bit in the Current Processor
Status Register is 1.

Bits 31-0: Channel Address (CHA)— This field con- 
tains the address of the current channel transaction (if 
the FZ bit of the Current Processor Status Register is 0). 
For external data accesses, the address is virtual if ad- 
dress translation was enabled for the access, or physi- 
cal if translation was disabled. For transfers to the 
coprocessor, the CHA field contains data transferred to 
the coprocessor. 

Channel Data (Register 5) 

This protected special-purpose register (Figure 11) is 
used to report exceptions during external accesses or 
coprocessor transfers. It is also used to restart the first 
store of an interrupted Store Multiple operation and to
restart other external accesses when possible (e.g., af- 
ter TLB misses are serviced). 

The Channel Data Register is updated on the execution 
of every load or store instruction, and on every load or
store in a Load Multiple or Store Multiple sequence, except
when the Freeze (FZ) bit in the Current Processor
Status Register is 1. When the Channel Data Register is
updated for a load operation, the resulting value is un- 
predictable. 

Bits 31-0: Channel Data (CHD)— This field contains 
the data (if any) associated with the current channel 
transaction (if the FZ bit of the Current Processor Status
Register is 0). If the current channel transaction is not a 
store or a transfer to the coprocessor, the value of this 
field is irrelevant. 

Channel Control (Register 6) 

This protected special-purpose register (Figure 12) is 
used to report exceptions during external accesses or 
coprocessor transfers. It also is used to restart interrupted
Load Multiple and Store Multiple operations, and
to restart other external accesses when possible (e.g., 
after TLB misses are serviced). 

The Channel Control Register is updated on the execu- 
tion of every load or store instruction, and on every load 
or store in a Load Multiple or Store Multiple sequence, 
except when the Freeze (FZ) bit in the Current Processor
Status Register is 1.

Bits 31-24 — These bits are a direct copy of bits 23-16 
from the load or store instruction that started the current 
channel transaction. 

Bits 23-16: Load/Store Count Remaining (CR)— The 
CR field indicates the remaining number of transfers for 
a Load Multiple or Store Multiple operation that encoun- 
tered an exception or was interrupted before comple- 
tion. This number is zero-based; for example, a value of 
28 in this field indicates that 29 transfers remain to be 
completed. If the fault or interrupt occurs on the last 
transaction, the CR field contains a value of 0 and the 
ML bit is 1 (see below). 

Bit 15: Load/Store (LS)— The LS bit is 0 if the channel 
transaction is a store operation, and 1 if it is a load 
operation. 

Bit 14: Multiple Operation (ML)— The ML bit is 1 if the 
current channel transaction is a partially complete Load 
Multiple or Store Multiple operation; otherwise it is 0. 

Bit 13: Set (ST)— The ST bit is 1 if the current channel 
transaction is for a Load and Set instruction; otherwise it 
is 0. 

Bit 12: Lock Active (LA)— The LA bit is 1 if the current 
channel transaction is for a Load and Lock or Store and 
Lock instruction; otherwise it is 0. Note that this bit is not 
set as the result of the Lock (LK) bit in the Current Pro- 
cessor Status Register. 



Bit 11: reserved. 



Bit 10: Transaction Faulted (TF)— The TF bit indicates 
that the current channel transaction was not completed 
due to some exceptional circumstance. This bit is set 
only for exceptions reported via the DERR input, and it 
causes a Data Access Exception or Coprocessor Exception 
trap to occur (depending on the value of the CE 
bit) when it is 1. 

The TF bit allows the proper sequencing of externally re- 
ported errors that get preempted by higher-priority 
traps; it is reset by software that handles the resulting 
trap. 

Bits 9-2: Target Register (TR)— The TR field indicates 
the absolute register number of the data operand for the 
current transaction (either a load target or store data 
source). Since the register number in this field is abso- 
lute, it reflects the Stack-Pointer addition when the indi- 
cated register is a local register. 

Bit 1: Not Needed (NN)— The NN bit indicates that, 
even though the Channel Address, Channel Data, and 
Channel Control registers contain a valid representation 
of an uncompleted load operation, the data requested is 
not needed. This situation arises when a load instruction 
is overlapped with an instruction that writes the load tar- 
get register. 

Bit 0: Contents Valid (CV)— The CV bit indicates that 
the contents of the Channel Address, Channel Data, 
and Channel Control registers are valid. 
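
The field layout described above can be unpacked with ordinary shifts and masks. The following C fragment is a minimal sketch only: the names chc_fields, chc_decode, and chc_transfers_remaining are hypothetical and are not part of any AMD software. It assumes the register value has already been copied into a 32-bit variable, and it does not interpret bits 31-24.

```c
#include <stdint.h>

/* Decoded view of the Channel Control Register (illustrative names). */
typedef struct {
    unsigned cr; /* bits 23-16: Load/Store Count Remaining (zero-based) */
    unsigned ls; /* bit 15: 1 = load, 0 = store */
    unsigned ml; /* bit 14: partially complete Load/Store Multiple */
    unsigned st; /* bit 13: Load and Set */
    unsigned la; /* bit 12: Load and Lock or Store and Lock */
    unsigned tf; /* bit 10: Transaction Faulted */
    unsigned tr; /* bits 9-2: absolute target register number */
    unsigned nn; /* bit 1: Not Needed */
    unsigned cv; /* bit 0: Contents Valid */
} chc_fields;

static chc_fields chc_decode(uint32_t chc)
{
    chc_fields f;
    f.cr = (chc >> 16) & 0xFFu;
    f.ls = (chc >> 15) & 1u;
    f.ml = (chc >> 14) & 1u;
    f.st = (chc >> 13) & 1u;
    f.la = (chc >> 12) & 1u;
    f.tf = (chc >> 10) & 1u;
    f.tr = (chc >> 2) & 0xFFu;
    f.nn = (chc >> 1) & 1u;
    f.cv = chc & 1u;
    return f;
}

/* Transfers still to be done for an interrupted Load Multiple or
 * Store Multiple. CR is zero-based, so CR = 28 means 29 transfers.
 * Returns 0 when the ML bit is clear. */
static unsigned chc_transfers_remaining(uint32_t chc)
{
    chc_fields f = chc_decode(chc);
    return f.ml ? f.cr + 1u : 0u;
}
```

With CR = 28 and ML = 1, chc_transfers_remaining reports 29, matching the zero-based example given for the CR field.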

Register Bank Protect (Register 7) 

This protected special-purpose register (Figure 13) pro- 
tects banks of general-purpose registers from User- 
mode program accesses. 

The general-purpose registers are partitioned into 16 
banks of 16 registers each (except that Bank 0 contains 
14 registers). The banks are organized as shown in 
Figure 4. 

Bits 31-16: reserved. 

Bits 15-0: Bank 15 through Bank 0 Protection Bits 
(B15-B0) — In the Register Bank Protect Register, each 
bit is associated with a particular bank of registers and 
the bit number gives the associated bank number (e.g., 
B11 determines the protection for Bank 11). 

Figure 12. Channel Control Register 

Figure 13. Register Bank Protect Register 



When a protection bit is 1, the corresponding bank is 
protected from access by programs executing in the 
User mode. A Protection Violation trap occurs when a 
User-mode program attempts to access (either read or 
write) a register in a protected bank. When a bit in this 
register is 0, the corresponding bank is available to 
programs executing in the User mode. 

Supervisor-mode programs are not affected by the Reg- 
ister Bank Protect Register. 

Register protection is based on absolute register num- 
bers. For local registers, the protection checking is per- 
formed after the Stack-Pointer addition is performed. 
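
The protection check can be pictured as a single bit test on the absolute register number. The sketch below is illustrative only. It assumes that Bank n covers absolute register numbers 16n through 16n+15 (the bank organization of Figure 4 is not reproduced here), and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bank number for an absolute register number (0-255).
 * Assumption: banks are consecutive groups of 16 registers. */
static unsigned reg_bank(unsigned abs_reg)
{
    return (abs_reg >> 4) & 0xFu;
}

/* True if a User-mode access to abs_reg would cause a Protection
 * Violation trap for the given Register Bank Protect value.
 * Supervisor-mode accesses are never affected by this register. */
static bool user_access_traps(uint32_t rbp, unsigned abs_reg)
{
    return ((rbp >> reg_bank(abs_reg)) & 1u) != 0;
}
```

For a local register, abs_reg must be the number obtained after the Stack-Pointer addition, since protection is based on absolute register numbers.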

Timer Counter (Register 8) 

This protected special-purpose register (Figure 14) 
contains the counter for the Timer Facility. 

Bits 31-24: reserved. 

Bits 23-0: Timer Count Value (TCV)— The 24-bit TCV 
field decrements by one on each processor clock. When 
the TCV field decrements to 0, it is reloaded with the 
content of the Timer Reload Value field in the Timer 
Reload Register. At this time, the Interrupt bit in the 
Timer Reload Register is set. 

Timer Reload (Register 9) 

This protected special-purpose register (Figure 15) 
maintains synchronization of the Timer Counter Register, 
enables Timer interrupts, and maintains Timer 
Facility status information. 

Bits 31-27: reserved. 

Bit 26: Overflow (OV)— The OV bit indicates that a 
Timer interrupt occurred before a previous Timer interrupt 
was serviced. It is set if the Interrupt (IN) bit is 1 (see 
below) when the Timer Count Value (TCV) field of the 
Timer Counter Register decrements to 0. In this case, a 
Timer interrupt caused by the IN bit has not been ser- 
viced when another interrupt is created. 

Bit 25: Interrupt (IN)— The IN bit is set whenever the 
TCV field decrements to 0. If this bit is 1 and the IE bit is 
also 1, a Timer interrupt occurs. Note that the IN bit is set 
when the TCV field decrements to 0, regardless of the 
value of the IE bit. The IN bit is reset by software that 
handles the Timer interrupt. 

The TCV field is zero-based with respect to the Timer in- 
terrupt interval; for example, a value of 28 in the TCV 
field causes the IN bit to be set in the 29th subsequent 
processor cycle. The reason for this is that the TCV field 
is 0 for a complete cycle before the IN bit is set. 

Bit 24: Interrupt Enable (IE)— When the IE bit is 1, the 
Timer interrupt is enabled, and the Timer interrupt occurs 
whenever the IN bit is 1. When this bit is 0, the 
Timer interrupt is disabled. Note that Timer interrupts 
may be disabled by the DA bit of the Current Processor 
Status Register regardless of the value of the IE bit. 

Bits 23-0: Timer Reload Value (TRV)— The value of 
this field is written into the Timer Count Value (TCV) field 
of the Timer Counter Register when the TCV field decre- 
ments to 0. 
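
The interaction of the TCV, TRV, IN, and OV fields can be followed in a small software model. This is a sketch under stated assumptions, not a description of the hardware implementation: it models one call to timer_clock as one processor clock, reads the phrase "0 for a complete cycle" as meaning that the reload and the setting of IN happen on the clock after TCV reaches 0, and ignores the DA bit. The type timer_model and both function names are hypothetical.

```c
#include <stdint.h>

#define TMR_OV         (1u << 26)   /* Timer Reload bit 26 */
#define TMR_IN         (1u << 25)   /* Timer Reload bit 25 */
#define TMR_IE         (1u << 24)   /* Timer Reload bit 24 */
#define TMR_VALUE_MASK 0x00FFFFFFu  /* TCV / TRV, bits 23-0 */

typedef struct {
    uint32_t tmc; /* Timer Counter Register: TCV in bits 23-0 */
    uint32_t tmr; /* Timer Reload Register: OV, IN, IE, TRV */
} timer_model;

/* Advance the model by one processor clock. */
static void timer_clock(timer_model *t)
{
    uint32_t tcv = t->tmc & TMR_VALUE_MASK;

    if (tcv == 0) {
        /* TCV has been 0 for a full cycle: flag the interrupt,
         * record an overflow if the previous one is unserviced,
         * and reload TCV from TRV. */
        if (t->tmr & TMR_IN)
            t->tmr |= TMR_OV;
        t->tmr |= TMR_IN;
        t->tmc = t->tmr & TMR_VALUE_MASK;
    } else {
        t->tmc = tcv - 1u;
    }
}

/* A Timer interrupt is requested only when IN and IE are both 1. */
static int timer_interrupt_pending(const timer_model *t)
{
    return (t->tmr & TMR_IN) && (t->tmr & TMR_IE);
}
```

Starting from TCV = 28, the model leaves IN clear for 28 clocks and sets it on the 29th, which agrees with the zero-based interval described for the TCV field.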

Program Counter 0 (Register 10) 

This protected special-purpose register (Figure 16) is 
used on an interrupt return to restart the instruction that 
was in the decode stage when the original interrupt or 
trap was taken. 

Bits 31-2: Program Counter 0 (PC0)— This field captures 
the word address of an instruction as it enters the 
decode stage of the processor pipeline, unless the 
Freeze (FZ) bit of the Current Processor Status Register 
is 1. If the FZ bit is 1, PC0 holds its value. 

When an interrupt or trap is taken, the PC0 field contains 
the word address of the instruction in the decode stage; 
the interrupt or trap has prevented this instruction from 
executing. The processor uses the PC0 field to restart 
this instruction on an interrupt return. 

Bits 1-0: Zeros— These bits are 0 since instruction 
addresses are always word-aligned. 

Program Counter 1 (Register 11) 

This protected special-purpose register (Figure 17) is 
used on an interrupt return to restart the instruction that 
was in the execute stage when the original interrupt or 
trap was taken. 

Bits 31-2: Program Counter 1 (PC1)— This field captures 
the word address of an instruction as it enters the 
execute stage of the processor pipeline, unless the 
Freeze (FZ) bit of the Current Processor Status Register 
is 1. If the FZ bit is 1, PC1 holds its value. 

When an interrupt or trap is taken, the PC1 field contains 
the word address of the instruction in the execute stage; 
the interrupt or trap has prevented this instruction from 
completing execution. The processor uses the PC1 field 
to restart this instruction on an interrupt return. 

Bits 1-0: Zeros— These bits are 0 since instruction 
addresses are always word-aligned. 

Program Counter 2 (Register 12) 

This protected special-purpose register (Figure 18) re- 
ports the address of certain instructions causing traps. 

Bits 31-2: Program Counter 2 (PC2)— This field cap- 
tures the word address of an instruction as it enters the 
write-back stage of the processor pipeline, unless the 
Freeze (FZ) bit of the Current Processor Status Register 
is 1. If the FZ bit is 1, PC2 holds its value. 

When an interrupt or trap is taken, the PC2 field contains 
the word address of the instruction in the write-back 
stage. In certain cases, PC2 contains the address of the 
instruction causing a trap. The PC2 field is used to report 
the address of this instruction, and has no other use in 
the processor. 

Bits 1-0: Zeros— These bits are 0 since instruction 
addresses are always word-aligned. 

Figure 16. Program Counter Register 

MMU Configuration (Register 13) 

This protected special-purpose register (Figure 19) 
specifies parameters associated with the Memory Man- 
agement Unit (MMU). 

Bits 31-10: reserved. 

Bits 9-8: Page Size (PS)— The PS field specifies the 
page size for address translation. The page size affects 
translation as discussed in the Memory Management 
section. The PS field has a delayed effect on address 
translation. At least one cycle of delay must separate an 
instruction that sets the PS field and an instruction that 
performs address translation. The PS field is encoded 
as follows: 



PS       Page Size 

0 0      1 kb 
0 1      2 kb 
1 0      4 kb 
1 1      8 kb 



Bits 7-0: Process Identifier (PID)— For translated 
User-mode loads and stores, this 8-bit field is compared 
to Task Identifier (TID) fields in Translation Look-Aside 
Buffer entries when address translation is performed. 
For the address translation to be valid, the PID field must 
match the TID field in an entry. This allows a separate 
32-bit virtual-address space to be allocated to each active 
User-mode process (within the limit of 255 such 
processes). Translated Supervisor-mode loads and 
stores use a fixed process identifier of 0, and require that 
the TID field be 0 for successful translation. 
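
The PS encoding and the PID comparison can be expressed compactly in C. The fragment below is a sketch only: it assumes that "kb" denotes 1024 bytes, and the function names are hypothetical.

```c
#include <stdint.h>

/* Page size in bytes selected by the PS field (bits 9-8):
 * 00 = 1 kb, 01 = 2 kb, 10 = 4 kb, 11 = 8 kb. */
static uint32_t mmu_page_size_bytes(uint32_t mmu)
{
    unsigned ps = (mmu >> 8) & 0x3u;
    return 1024u << ps;
}

/* Process Identifier field (bits 7-0). */
static unsigned mmu_pid(uint32_t mmu)
{
    return mmu & 0xFFu;
}

/* Identifier comparison for a translated load or store.
 * Supervisor-mode accesses use a fixed process identifier of 0. */
static int tid_matches(uint32_t mmu, unsigned tid, int supervisor)
{
    unsigned pid = supervisor ? 0u : mmu_pid(mmu);
    return (tid & 0xFFu) == pid;
}
```

For example, a register value with PS = 10 and PID = 0x2A selects 4096-byte pages, and a TLB entry whose TID is 0x2A matches only for User-mode accesses.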

LRU Recommendation (Register 14) 

This protected special-purpose register (Figure 20) as- 
sists Translation Look-Aside Buffer (TLB) reloading by 
indicating the least recently used TLB entry in the re- 
quired replacement line. 

Bits 31-7: reserved. 

Bits 6-1: Least Recently Used Entry (LRU)— The 
LRU field is updated whenever a TLB miss occurs dur- 
ing an address translation. It gives the TLB register 
number of the TLB entry selected for replacement. The 
LRU field also is updated whenever a memory-protec- 
tion violation occurs; however, it has no interpretation in 
this case. 

Bit 0: Zero — The appended 0 serves to identify Word 0 
of the TLB entry. 

Indirect Pointer C (Register 128) 

This unprotected special-purpose register (Figure 21) 
provides the RC-operand register number when an instruction 
RC field has the value 0 (i.e., when Global Register 0 
is specified). 

Bits 31-10: reserved. 

Bits 9-2: Indirect Pointer C (IPC)— The 8-bit IPC field 
contains an absolute register number for a general-purpose 
register. This number directly selects a register 
(Stack-Pointer addition is not performed in the case of 
local registers). 

Figure 19. MMU Configuration Register 

Figure 20. LRU Recommendation Register 

Figure 21. Indirect Pointer C Register 

Bits 1-0: Zeros— The IPC field is aligned for compati- 
bility with word addresses. 

Indirect Pointer A (Register 129) 

This unprotected special-purpose register (Figure 22) 
provides the RA-operand register number when an instruction 
RA field has the value 0 (i.e., when Global Register 0 
is specified). 

Bits 31-10: reserved. 

Bits 9-2: Indirect Pointer A (IPA)— The 8-bit IPA field 
contains an absolute register number for either a 
general-purpose register or a local register. This num- 
ber directly selects a register (Stack-Pointer addition is 
not performed in the case of local registers). 

Bits 1-0: Zeros— The IPA field is aligned for compati- 
bility with word addresses. 

Indirect Pointer B (Register 130) 

This unprotected special-purpose register (Figure 23) 
provides the RB-operand register number when an instruction 
RB field has the value 0 (i.e., when Global Register 0 
is specified). 

Bits 31-10: reserved. 

Bits 9-2: Indirect Pointer B (IPB)— The 8-bit IPB field 
contains an absolute register number for a general- 
purpose register. This number directly selects a register 
(Stack-Pointer addition is not performed in the case of 
local registers). 

Bits 1-0: Zeros— The IPB field is aligned for compatibility 
with word addresses. 

Q (Register 131) 

The Q Register is an unprotected special-purpose regis- 
ter (Figure 24). 

Bits 31-0: Quotient/Multiplier (Q)— During a sequence 
of divide steps, this field holds the low-order bits 
of the dividend; it contains the quotient at the end of the 
divide. During a sequence of multiply steps, this field 
holds the multiplier; it contains the low-order bits of the 
result at the end of the multiply. 

For an integer divide instruction, the Q field contains the 
high-order bits of the dividend at the beginning of the in- 
struction, and contains the remainder upon completion 
of the instruction. 

ALU Status (Register 132) 

This unprotected special-purpose register (Figure 25) 
holds information about the outcome of Arithmetic/Logic 
Unit (ALU) operations as well as control for certain op- 
erations performed by the Execution Unit. 

Bits 31-12: reserved. 

Bit 11: Divide Flag (DF)— The DF bit is used by the instructions 
that implement division. This bit is set at the 
end of the division instructions either to 1 or to the com- 
plement of the 33rd bit of the ALU. When a Divide Step 
instruction is executed, the DF bit then determines 
whether an addition or subtraction operation is per- 
formed by the ALU. 

Figure 22. Indirect Pointer A Register 

Figure 23. Indirect Pointer B Register 

Figure 25. ALU Status Register 



Bit 10: Overflow (V)— The V bit indicates that the result 
of a signed, twos-complement ALU operation required 
more than 32 bits to represent the result correctly. The 
value of this bit is determined by exclusive ORing the 
ALU carry-out with the carry-in to the most-significant bit 
for signed, twos-complement operations. This bit is not 
used for any special purpose in the processor, and is 
provided for information only. 

Bit 9: Negative (N)— The N bit is set with the value of 
the most-significant bit of the result of an arithmetic or 
logical operation. If twos-complement overflow occurs, 
the N bit does not reflect the true sign of the result. This 
bit is used in divide operations. 

Bit 8: Zero (Z)— The Z bit indicates that the result of an 
arithmetic or logical operation is 0. This bit is not used for 
any special purpose in the processor, and is provided for 
information only. 

Bit 7: Carry (C)— The C bit stores the carry-out of the 
ALU for arithmetic operations. It is used by the add-with- 
carry and subtract-with-carry instructions to generate 
the carry into the Arithmetic/Logic Unit. 
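
The definitions of the V, N, Z, and C bits can be checked against a 32-bit addition carried out in software. The following C fragment is a sketch for illustration: the function add_flags is hypothetical, it models only an add with carry-in, and it does not claim which instructions update which bits.

```c
#include <stdint.h>

#define ALU_V (1u << 10) /* Overflow */
#define ALU_N (1u << 9)  /* Negative */
#define ALU_Z (1u << 8)  /* Zero */
#define ALU_C (1u << 7)  /* Carry */

/* Add a + b + carry_in as the ALU would, store the 32-bit sum in
 * *result, and return the V, N, Z, and C bits in their ALU Status
 * Register positions. */
static uint32_t add_flags(uint32_t a, uint32_t b, unsigned carry_in,
                          uint32_t *result)
{
    unsigned cin = carry_in & 1u;
    uint64_t wide = (uint64_t)a + b + cin;
    uint32_t sum = (uint32_t)wide;

    /* Carry out of bit 31 (the ALU carry-out). */
    unsigned carry_out = (unsigned)(wide >> 32) & 1u;

    /* Carry into bit 31: add the low 31 bits only. */
    unsigned carry_into_msb =
        (unsigned)(((a & 0x7FFFFFFFu) + (b & 0x7FFFFFFFu) + cin) >> 31) & 1u;

    uint32_t flags = 0;
    if (carry_out ^ carry_into_msb)
        flags |= ALU_V;  /* exclusive OR of the two carries */
    if (sum >> 31)
        flags |= ALU_N;  /* most-significant bit of the result */
    if (sum == 0)
        flags |= ALU_Z;
    if (carry_out)
        flags |= ALU_C;

    *result = sum;
    return flags;
}
```

Adding 1 to 0x7FFFFFFF sets V and N, which shows the case noted above where N does not reflect the true sign of the result. Adding 1 to 0xFFFFFFFF sets Z and C but not V.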

Bits 6-5: Byte Pointer (BP)— The BP field holds a 2-bit 
pointer to a byte within a word. It is used by Insert Byte 
and Extract Byte instructions. The exact mapping of the 
pointer value to the byte position depends on the value 
of the Byte Order (BO) bit in the Configuration Register. 

The most-significant bit of the BP field is used to deter- 
mine the position of a half-word within a word for the In- 
sert Half-Word, Extract Half-Word, and Extract Half- 
Word, Sign-Extended instructions. The exact mapping 
of the most-significant bit to the half-word position de- 
pends on the value of the BO bit in the Configuration 
Register. 

The BP field is set by a Move To Special Register in- 
struction with either the ALU Status Register or the Byte 
Pointer Register as the destination. It is also set by a 
load or store instruction if the Set Byte Pointer (SB) bit in 
the instruction is 1. A load or store sets the BP field 
either with the two least-significant bits of the address (if 
the DW bit of the Configuration Register is 0) or with the 
complement of the Byte Order bit of the Configuration 
Register (if DW is 1). 

Bits 4-0: Funnel Shift Count (FC)— The FC field con- 
tains a 5-bit shift count for the Funnel Shifter. The Fun- 
nel Shifter concatenates two source operands into a sin- 
gle 64-bit operand and extracts a 32-bit result from this 
64-bit operand; the FC field specifies the number of bit 
positions from the most-significant bit of the 64-bit oper- 
and to the most-significant bit of the 32-bit result. The 
FC field is used by the Extract instruction. 

The FC field is set by a Move To Special Register instruction 
with either the ALU Status Register or the Funnel 
Shift Count Register as the destination. 
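
The Funnel Shifter operation described for the FC field can be reproduced with a 64-bit intermediate value. The function below is a minimal sketch with a hypothetical name. It assumes that the first source operand forms the high-order 32 bits of the 64-bit operand.

```c
#include <stdint.h>

/* Concatenate high:low into a 64-bit operand and return the 32-bit
 * field whose most-significant bit lies fc bit positions below the
 * most-significant bit of the 64-bit operand. */
static uint32_t funnel_extract(uint32_t high, uint32_t low, unsigned fc)
{
    uint64_t operand = ((uint64_t)high << 32) | low;

    fc &= 0x1Fu;  /* FC is a 5-bit field */
    return (uint32_t)((operand << fc) >> 32);
}
```

With high = 0x12345678, low = 0x9ABCDEF0, and FC = 8, the result is 0x3456789A. With FC = 0 the result is simply the high operand.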

Byte Pointer (Register 133) 

This unprotected special-purpose register (Figure 26) 
provides an alternate access to the BP field in the ALU 
Status Register. 

Bits 31-2: Zeros. 

Bits 1-0: Byte Pointer (BP)— This field allows a program 
to change the BP field without affecting other fields 
in the ALU Status Register. 

Funnel Shift Count (Register 134) 

This unprotected special-purpose register (Figure 27) 
provides an alternate access to the FC field in the ALU 
Status Register. 

Bits 31-5: Zeros. 

Figure 27. Funnel Shift Count 



Bits 4-0: Funnel Shift Count (FC)— This field allows a 
program to change the FC field without affecting other 
fields in the ALU Status Register. 

Load/Store Count Remaining (Register 135) 

This unprotected special-purpose register (Figure 28) 
provides alternate access to the CR field in the Channel 
Control Register. 

Bits 31-8: Zeros. 

Bits 7-0: Load/Store Count Remaining (CR)— This 
field allows a program to change the CR field without af- 
fecting other fields in the Channel Control Register, and 
is used to initialize the value before a Load Multiple or 
Store Multiple instruction is executed. 

Floating-Point Environment (Register 160) 

This unprotected special-purpose register (Figure 29) 
contains control bits that affect the execution of floating- 
point operations. 

Bits 31-9: reserved. 

Bit 8: Fast Float Select (FF)— The FF bit being 1 en- 
ables fast floating-point operations, in which certain re- 
quirements of the IEEE floating-point specification are 
not met. This improves the performance of certain 
operations by sacrificing conformance to the IEEE 
specification. 

Bits 7-6: Floating-Point Round Mode (FRM)— This 
field specifies the default mode used to round the results 
of floating-point operations, as follows: 



FRM1-0     Round Mode 

0 0        Round to nearest 
0 1        Round to -∞ 
1 0        Round to +∞ 
1 1        Round to zero 



Bit 5: Floating-Point Divide-By-Zero Mask (DM)— If 
the DM bit is 0, a Floating-Point Exception trap occurs 
when the divisor of a floating-point division operation is 
zero and the dividend is a non-zero, finite number. If the 
DM bit is 1, a Floating-Point Exception trap does not 
occur for divide-by-zero. 

Bit 4: Floating-Point Inexact Result Mask (XM)— If 
the XM bit is 0, a Floating-Point Exception trap occurs 
when the result of a floating-point operation is not equal 
to the infinitely precise result. If the XM bit is 1, a 
Floating-Point Exception trap does not occur for an inexact 
result. 

Bit 3: Floating-Point Underflow Mask (UM)— If the 
UM bit is 0, a Floating-Point Exception trap occurs when 
the result of a floating-point operation is too small to be 
expressed in the destination format. If the UM bit is 1, a 
Floating-Point Exception trap does not occur for 
underflow. 

Bit 2: Floating-Point Overflow Mask (VM)— If the VM 
bit is 0, a Floating-Point Exception trap occurs when the 
result of a floating-point operation is too large to be 
expressed in the destination format. If the VM bit is 1, a 
Floating-Point Exception trap does not occur for 
overflow. 

Figure 28. Load/Store Count Remaining 

Figure 29. Floating-Point Environment 

Bit 1: Floating-Point Reserved Operand Mask (RM)— 
If the RM bit is 0, a Floating-Point Exception trap occurs 
when one or more input operands to a floating-point 
operation is a reserved value, or when the result of a 
floating-point operation is a reserved value. If the RM bit 
is 1, a Floating-Point Exception trap does not occur for 
reserved operands. 

Bit 0: Floating-Point Invalid Operation Mask (NM)— 
If the NM bit is 0, a Floating-Point Exception trap occurs 
when the input operands to a floating-point operation 
produce an indeterminate result (e.g., ∞ times 0). If the 
NM bit is 1, a Floating-Point Exception trap does not 
occur for invalid operations. 

Integer Environment (Register 161) 

This unprotected special-purpose register (Figure 30) 
contains control bits that affect the execution of integer 
operations. 

Bits 31-2: reserved. 

Bit 1: Integer Division Overflow Mask (DO)— If the 
DO bit is 0, an Out of Range trap occurs when overflow 
of a signed or unsigned 32-bit result occurs during DIVIDE 
or DIVIDU instructions, respectively. If the DO bit 
is 1, an Out of Range trap does not occur for overflow 
during integer divide operations. 

The DIVIDE and DIVIDU instructions always cause an 
Out of Range trap upon division by 0, regardless of the 
value of the DO bit. 

Bit 0: Integer Multiplication Overflow Exception 
Mask (MO)— If the MO bit is 0, an Out of Range trap occurs 
when overflow of a signed or unsigned 32-bit result 
occurs during MULTIPLY or MULTIPLU instructions, 
respectively. If the MO bit is 1, an Out of Range trap does 
not occur for overflow during integer multiply operations. 

Floating-Point Status (Register 162) 

This unprotected special-purpose register (Figure 31) 
contains status bits indicating the outcome of floating- 
point operations. The bits of the Floating-Point Status 
Register are divided into two groups of status bits. The 
bits in each group correspond to the causes of Floating- 
Point Exception traps that are enabled and disabled by 
bits 5-0 of the Floating-Point Environment Register. 

The first group of status bits (bits 13-8) are trap status 
bits that report the cause of a Floating-Point Exception 
trap. The trap status bits are set only when a Floating- 
Point Exception trap occurs, and indicate all conditions 
that apply to the trapping operation. All other operations 
leave the status bits unchanged. A trap status bit is 
set regardless of the state of the corresponding mask 
bit of the Floating-Point Environment Register, except 
that at least one of the mask bits must be 0 for the trap 
to occur. When a Floating-Point Exception trap occurs, 
all trap status bits not relevant to the trapping operation 
are reset. 

The second group of status bits (bits 5-0) are sticky 
status bits that, once set, remain set until explicitly 
cleared by a Move to Special Register (MTSR) or Move 
to Special Register Immediate (MTSRIM) instruction. 
A sticky status bit is set only when a floating-point 
exception is detected and the corresponding mask bit 
of the Floating-Point Environment Register is 1. That is, 
the sticky status bit is set only if the corresponding cause 
of a Floating-Point Exception trap is disabled. Normally, 
this means that sticky status bits are not set when a 
Floating-Point Exception trap is taken. However, if 
multiple exceptions are detected, a sticky status bit 
corresponding to a masked exception may still be set if 
a Floating-Point Exception trap occurs for an unmasked 
exception. 

Bits 31-14: reserved. 

Bit 13: Floating-Point Divide-By-Zero Trap (DT)— 
The DT bit is set when a Floating-Point Exception trap 
occurs, and the associated floating-point operation is a 
divide with a zero divisor and a non-zero, finite dividend. 
Otherwise, this bit is reset when a Floating-Point Excep- 
tion trap occurs. 

Bit 12: Floating-Point Inexact Result Trap (XT)— The 
XT bit is set when a Floating-Point Exception trap oc- 
curs, and the result of the associated floating-point op- 
eration is not equal to the infinitely precise result. Other- 
wise, this bit is reset when a Floating-Point Exception 
trap occurs. 

Bit 11: Floating-Point Underflow Trap (UT)— The UT 
bit is set when a Floating-Point Exception trap occurs, 
and the result of the associated floating-point operation 
is too small to be expressed in the destination format. 
Otherwise, this bit is reset when a Floating-Point Excep- 
tion trap occurs. 

Bit 10: Floating-Point Overflow Trap (VT)— The VT 
bit is set when a Floating-Point Exception trap occurs, 
and the result of the associated floating-point operation 
is too large to be expressed in the destination format. 
Otherwise, this bit is reset when a Floating-Point Excep- 
tion trap occurs. 

Bit 9: Floating-Point Reserved Operand Trap (RT)— 
The RT bit is set when a Floating-Point Exception trap 
occurs, and either one or more input operands to the as- 
sociated floating-point operation is a reserved value or 
the result of this floating-point operation is a reserved 
value. Otherwise, this bit is reset when a Floating-Point 
Exception trap occurs. 

Bit 8: Floating-Point Invalid Operation Trap (NT)— 
The NT bit is set when a Floating-Point Exception trap 
occurs, and the input operands to the associated float- 
ing-point operation produce an indeterminate result. 
Otherwise, this bit is reset when a Floating-Point Excep- 
tion trap occurs. 

Bits 7-6: reserved. 

Bit 5: Floating-Point Divide-By-Zero Sticky (DS)— 
The DS bit is set when the DM bit of the Floating-Point 
Environment Register is 1, the divisor of a floating-point 
division operation is a 0, and the dividend is a non-zero, 
finite number. 

Bit 4: Floating-Point Inexact Result Sticky (XS)— 
The XS bit is set when the XM bit of the Floating-Point 
Environment Register is 1, and the result of a floating- 
point operation is not equal to the infinitely precise 
result. 

Bit 3: Floating-Point Underflow Sticky (US)— The US 
bit is set when the UM bit of the Floating-Point Environment 
Register is 1, and the result of a floating-point operation 
is too small to be expressed in the destination 
format. 

Bit 2: Floating-Point Overflow Sticky (VS)— The VS 
bit is set when the VM bit of the Floating-Point Environment 
Register is 1, and the result of a floating-point operation 
is too large to be expressed in the destination 
format. 

Bit 1: Floating-Point Reserved Operand Sticky 
(RS)— The RS bit is set when the RM bit of the Floating- 
Point Environment Register is 1, and either one or more 
input operands to a floating-point operation is a re- 
served value or the result of a floating-point operation is 
a reserved value. 

Bit 0: Floating-Point Invalid Operation Sticky (NS)— 
The NS bit is set when the NM bit of the Floating-Point 
Environment Register is 1, and the input operands to 
a floating-point operation produce an indeterminate 
result. 
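
Because the mask bits, trap status bits, and sticky status bits use the same cause ordering (divide-by-zero, inexact, underflow, overflow, reserved operand, invalid operation), the update rules above reduce to a few bit operations. The C fragment below is a software sketch of those rules as read from this section. The function name and the "detected" argument (a 6-bit set of causes in mask-bit order) are hypothetical.

```c
#include <stdint.h>

#define FP_CAUSE_MASK 0x3Fu /* six causes, bits 5-0 */

/* Apply the causes detected for one floating-point operation.
 *   fps      - Floating-Point Status value, updated in place
 *   fpe      - Floating-Point Environment value (mask bits 5-0)
 *   detected - detected causes, in the same bit order as the masks
 * Returns 1 if a Floating-Point Exception trap is taken. */
static int fps_update(uint32_t *fps, uint32_t fpe, unsigned detected)
{
    uint32_t masks = fpe & FP_CAUSE_MASK;
    uint32_t causes = detected & FP_CAUSE_MASK;

    /* A trap occurs if any detected cause has a mask bit of 0. */
    int trap = (causes & ~masks) != 0;

    /* Sticky bits (5-0): set for detected causes that are masked. */
    *fps |= causes & masks;

    if (trap) {
        /* Trap status bits (13-8): report every detected cause,
         * and reset the bits not relevant to this operation. */
        *fps = (*fps & ~(FP_CAUSE_MASK << 8)) | (causes << 8);
    }
    return trap;
}
```

For example, with only XM set in the environment, an operation that is both inexact and overflows takes a trap, sets XT and VT, and also sets the XS sticky bit, which is the multiple-exception case described above.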

Exception Opcode (Register 164) 

This unprotected special-purpose register (Figure 32) 
reports the operation code (opcode) of an instruction 
causing a trap. It is provided primarily for recovery from 
floating-point exceptions, but reports the opcode of any 
trapping instruction. 

Bits 31-8: reserved. 

Bits 7-0: Instruction Opcode (IOP)— This field captures 
the opcode of an instruction causing a trap as a result 
of instruction execution; the opcode is captured as 
the instruction enters the write-back stage of the processor 
pipeline. Instructions that do not trap as a consequence 
of execution do not modify the IOP field. 

TLB Registers 

The Am29000 contains 128 Translation Look-Aside 
Buffer (TLB) registers. The organization of the TLB registers 
is shown in Figure 33. 

The TLB registers comprise the TLB entries, and are 
provided so that programs may inspect and alter TLB 
entries. This allows the loading, invalidation, saving, 
and restoring of TLB entries. 

TLB registers have fields that are reserved for future 
processor implementations. When a TLB register is 
read, a bit in a reserved field is read as a 0. An attempt to 
write a reserved bit with a 1 has no effect; however, this 
should be avoided because of upward-compatibility 
considerations. 

The Translation Look-Aside Buffer (TLB) registers are 
accessed only by explicit data movement by Su- 
pervisor-mode programs. Instructions that move data to 
or from a TLB register specify a general-purpose regis- 
ter containing a TLB register number. The TLB register 
number is given by the contents of bits 6-0 of the 
general-purpose register. TLB register numbers may 
only be specified indirectly by general-purpose 
registers. 

TLB entries are accessed as registers numbered 
0-127. Since two words are required to completely 
specify a TLB entry, two registers are required for each 
TLB entry. The words corresponding to an entry are 
paired as two sequentially numbered registers starting 
on an even-numbered register. The word with the even 
register number is called Word 0, and the word with the 
odd register number is called Word 1. The entries for 
TLB Set 0 are in registers numbered 0-63, and the entries 
for TLB Set 1 are in registers numbered 64-127. 
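
The numbering scheme amounts to packing a set number, a line number, and a word number into the 7-bit TLB register number. The helper below is a sketch with a hypothetical name. It takes the 32 lines per set from the layout of Figure 33.

```c
/* TLB register number (0-127) for a given set (0-1),
 * line (0-31), and word (0-1) of a TLB entry. */
static unsigned tlb_reg_number(unsigned set, unsigned line, unsigned word)
{
    return ((set & 1u) << 6) | ((line & 0x1Fu) << 1) | (word & 1u);
}
```

Line 31, Word 1 of TLB Set 0 is register 63, and the same entry word in TLB Set 1 is register 127.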

TLB Entry Word 0 

The TLB Entry Word 0 register is shown in Figure 34. 

Bits 31-15: Virtual Tag (VTAG)— When the TLB is 
searched for an address translation, the VTAG field of 
the TLB entry must match the most significant 17, 16, 
15, or 14 bits of the address being translated (for page 
sizes of 1, 2, 4, and 8 kb, respectively) for the search to 
be successful. 



TLB Reg#    TLB Set    Contents 

0           Set 0      TLB Entry Line 0 Word 0 
1           Set 0      TLB Entry Line 0 Word 1 
2           Set 0      TLB Entry Line 1 Word 0 
3           Set 0      TLB Entry Line 1 Word 1 
...
62          Set 0      TLB Entry Line 31 Word 0 
63          Set 0      TLB Entry Line 31 Word 1 
64          Set 1      TLB Entry Line 0 Word 0 
65          Set 1      TLB Entry Line 0 Word 1 
...
126         Set 1      TLB Entry Line 31 Word 0 
127         Set 1      TLB Entry Line 31 Word 1 

Figure 33. Translation Look-Aside Buffer Registers 

Figure 34. TLB Entry Word 0 



When software loads a TLB entry with an address trans- 
lation, the most significant 14 bits of the Virtual Tag are 
set with the most significant 14 bits of the virtual address 
whose translation is being loaded into the TLB. The re- 
maining 3 bits of the Virtual Tag must be set either to the 
corresponding bits of the address or to 0s, depending on 
the page size, as follows ("A" refers to corresponding 
address bits): 

Page Size    VTAG 2-0 (TLB Word 0 bits 17-15) 

1 kb         AAA 
2 kb         AA0 
4 kb         A00 
8 kb         000 



Bit 14: Valid Entry (VE)— If this bit is 1, the associated 
TLB entry is valid; if it is 0, the entry is invalid. 

Bit 13: Supervisor Read (SR)— If the SR bit is 1, 
Supervisor-mode load operations from the virtual page are 
allowed; if it is 0, Supervisor-mode loads are not 
allowed. 

Bit 12: Supervisor Write (SW)— If the SW bit is 1, 
Supervisor-mode store operations to the virtual page are 
allowed; if it is 0, Supervisor-mode stores are not 
allowed. 

Bit 11: Supervisor Execute (SE)— If the SE bit is 1, 
Supervisor-mode instruction accesses to the virtual page 
are allowed; if it is 0, Supervisor-mode instruction 
accesses are not allowed. 

Bit 10: User Read (UR)— If the UR bit is 1, User-mode 
load operations from the virtual page are allowed; if it is 
0, User-mode loads are not allowed. 

Bit 9: User Write (UW)— If the UW bit is 1, User-mode 
store operations to the virtual page are allowed; if it is 0, 
User-mode stores are not allowed. 

Bit 8: User Execute (UE)— If the UE bit is 1, User-mode 
instruction accesses to the virtual page are allowed; if it 
is 0, User-mode instruction accesses are not allowed. 

Bits 7-0: Tasit Identifier (TiD)— When the TLB is 
searched for an address translation, the TID must match 
the Process Identifier (PID) in the f^MU Configuration 
Registerfor the translation to be successful. This field is 
allows the TLB entry to be associated with a particular 
process. 
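The field layout above can be illustrated with a short Python sketch that
packs a TLB Entry Word 0 value. It is only a model of the bit positions
described in this section; the function name make_tlb_word0 and its
parameter names are hypothetical and are not part of any AMD tool.

```python
# Illustrative model only: names are hypothetical, bit positions follow
# the TLB Entry Word 0 field descriptions in the text.

PAGE_SIZES_KB = (1, 2, 4, 8)

def make_tlb_word0(vaddr, page_kb, tid, valid=True,
                   sr=False, sw=False, se=False,
                   ur=False, uw=False, ue=False):
    """Pack a 32-bit TLB Entry Word 0 for the given virtual address."""
    if page_kb not in PAGE_SIZES_KB:
        raise ValueError("page size must be 1, 2, 4, or 8 kb")
    # Low-order VTAG bits forced to 0: none, 1, 2, or 3 for 1, 2, 4, 8 kb
    # (AAA, AA0, A00, 000 in the table above).
    zeroed = PAGE_SIZES_KB.index(page_kb)
    vtag = (vaddr >> 15) & 0x1FFFF              # address bits 31-15
    vtag &= (~((1 << zeroed) - 1)) & 0x1FFFF    # clear non-tag bits
    word = vtag << 15
    for bit, flag in ((14, valid), (13, sr), (12, sw), (11, se),
                      (10, ur), (9, uw), (8, ue)):
        if flag:
            word |= 1 << bit
    return word | (tid & 0xFF)                  # TID in bits 7-0
```

For example, make_tlb_word0(0xFFFFFFFF, 8, 0x12) keeps only the 14
most-significant address bits in the tag and sets the VE bit and the TID.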

TLB Entry Word 1 

The TLB Entry Word 1 register is shown in Figure 35. 

Bits 31-10: Real Page Number (RPN). The RPN field gives the most
significant 22, 21, 20, or 19 bits of the physical address of the page for
page sizes of 1, 2, 4, and 8 kb, respectively. It is concatenated to bits
9-0, 10-0, 11-0, or 12-0 of the address being translated (for 1-, 2-, 4-,
and 8-kb page sizes, respectively) to form the physical address for the
access.

When software loads a TLB entry with an address translation, the most
significant 19 bits of the Real Page Number are set with the most
significant 19 bits of the physical address associated with the
translation. The remaining 3 bits of the Real Page Number must be set
either to the corresponding bits of the physical address, or to 0s,
depending on the page size, as follows ("A" refers to corresponding
address bits):

Page Size    RPN 2-0 (TLB Word 1 bits 12-10)

1 kb         AAA
2 kb         AA0
4 kb         A00
8 kb         000
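As an illustration of how the RPN field and the page offset combine, the
following Python sketch forms a physical address from a TLB Entry Word 1
value. It is a simplified model under the field positions given above; the
function name translate is hypothetical, and the sketch does not perform
the tag match or protection checks.

```python
# Illustrative model only: "translate" is a hypothetical helper name.

OFFSET_BITS = {1: 10, 2: 11, 4: 12, 8: 13}   # page size in kb -> offset width

def translate(word1, vaddr, page_kb):
    """Concatenate the RPN of TLB Entry Word 1 with the page offset."""
    offset_mask = (1 << OFFSET_BITS[page_kb]) - 1
    rpn = (word1 >> 10) & 0x3FFFFF            # RPN field, bits 31-10
    # Keep only the RPN bits that are significant for this page size.
    frame = (rpn << 10) & (~offset_mask) & 0xFFFFFFFF
    return frame | (vaddr & offset_mask)
```

With 1-kb pages all 22 RPN bits are used; with 8-kb pages only the upper
19 bits contribute, and bits 12-0 come from the address being translated.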



Bits 7-6: User Programmable (PGM). These bits are placed on the
MPGM1-MPGM0 outputs when the address is transmitted for an access. They
have no predefined effect on the access; any effect is defined by logic
external to the processor.

(Fields: RPN, bits 31-10; res, bits 9-8; PGM, bits 7-6; res, bits 5-2;
U, bit 1; IO, bit 0)

Figure 35. TLB Entry Word 1

Bit 1: Usage (U). This bit indicates which entry in a given TLB line was
least recently used to perform an address translation. If this bit is 0,
then the entry in Set 0 in the line is least recently used; if it is 1,
then the entry in Set 1 is least recently used. This bit has an equal
value for both entries in a line. Whenever a TLB entry is used to
translate an address, the Usage bit of both entries in the line used for
translation is set according to the TLB set containing the translation.
This bit is set whenever the translation is valid, regardless of the
outcome of memory-protection checking.

Bit 0: Input/Output (IO). The IO bit determines whether the access is
directed to the instruction/data memory (IO = 0) or the input/output
(IO = 1) address space.

INSTRUCTION SET

The Am29000 implements 117 instructions. All instructions execute in a
single cycle except for IRET, IRETINV, LOADM, STOREM, and the trapping
arithmetic instructions such as floating-point instructions.

Most instructions deal with general-purpose registers for operands and
results; however, in most instructions, an 8-bit constant can be used in
place of a register-based operand. Some instructions deal with
special-purpose registers, TLB registers, external devices and memories,
and coprocessors.

This section describes the nine instruction classes in the 
Am29000, and provides a brief summary of instruction 
operations. 

If the processor attempts to execute an instruction that is 
not implemented, an Illegal Opcode trap occurs. 

Integer Arithmetic 

The Integer Arithmetic instructions perform add, subtract, multiply, and
divide operations on word-length integers. Certain instructions in this
class cause traps if signed or unsigned overflow occurs during the
execution of the instruction. There is support for multi-precision
arithmetic on operands whose lengths are multiples of words. All
instructions in this class set the ALU Status Register. The integer
arithmetic instructions are shown in Figure 36.

The instructions MULTIPLU, MULTMU, MULTIPLY, MULTM, DIVIDE, and DIVIDU
are not implemented directly by processor hardware, but cause traps to
instruction-emulation routines.
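The multi-precision support mentioned above can be pictured with a small
Python model of a 64-bit add built from two word-length adds, in the style
of the ADD and ADDC operations listed in Figure 36. This is a sketch, not
processor code: the helper names add32 and add64 are hypothetical, and it
assumes that the carry C consumed by the second add is the carry-out of
the first.

```python
# Illustrative model only: helper names are hypothetical.

MASK32 = 0xFFFFFFFF

def add32(a, b, carry_in=0):
    """Model a word-length add; returns (32-bit result, carry-out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two double-word values held as (high word, low word) pairs."""
    lo, carry = add32(a_lo, b_lo)        # like ADD: low words, produces C
    hi, _ = add32(a_hi, b_hi, carry)     # like ADDC: high words plus C
    return hi, lo
```

The same chaining extends to longer operands by repeating the
carry-consuming step for each additional word.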

Compare 

The Compare instructions test for various relationships between two
values. For all Compare instructions except the CPBYTE instruction, the
comparisons are performed on word-length signed or unsigned integers.
There are two types of Compare instructions. The first type places a
Boolean value reflecting the outcome of the compare into a
general-purpose register. For the second type (assert instructions),
instruction execution continues only if the comparison is true; otherwise
a trap occurs. The assert instructions specify a vector for the trap.

The assert instructions support run-time operand checking and
operating-system calls. If the trap occurs in the User mode and a trap
number between 0 and 63 is specified by the instruction, a Protection
Violation trap occurs. The Compare instructions are shown in Figure 37.

Logical 

The Logical instructions perform a set of bit-by-bit Boolean functions on
word-length bit strings. All instructions in this class set the ALU Status
Register. These instructions are shown in Figure 38.

Shift

The Shift instructions (Figure 39) perform arithmetic and logical shifts.
All but the Extract instruction operate on word-length data and produce a
word-length result. The Extract instruction operates on double-word data
and produces a word-length result. If both parts of the double word for
the Extract instruction are from the same source, the Extract operation is
equivalent to a rotate operation. For each operation, the shift count is a
5-bit integer, specifying a shift amount in the range of 0 to 31 bits.
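The relationship between Extract and rotate can be checked with a short
Python model. It is a sketch of the double-word shift described above; the
function name extract and the explicit count parameter are assumptions
made for illustration (on the processor the shift count is not an
instruction operand of this form).

```python
# Illustrative model only: "extract" and its count parameter are
# hypothetical stand-ins for the operation described in the text.

MASK32 = 0xFFFFFFFF

def extract(srca, srcb, count):
    """High-order word of the double word (srca // srcb) shifted left."""
    count &= 0x1F                                   # 5-bit count, 0 to 31
    double = ((srca & MASK32) << 32) | (srcb & MASK32)
    return ((double << count) >> 32) & MASK32

def rotate_left(word, count):
    """Rotate expressed as an Extract with both parts from one source."""
    return extract(word, word, count)
```

Passing the same word as both halves shifts bits out of the top of the
high word while the identical bits enter from the low word, which is
exactly a left rotate.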

Data Movement 

The Data Movement instructions (Figure 40) move bytes, half-words, and
words between processor registers. In addition, they move data between
general-purpose registers and external devices, memories, and the
coprocessor.

Constant 

The Constant instructions (Figure 41) provide the ability to place
half-word and word constants into registers. Most instructions in the
instruction set allow an 8-bit constant as an operand. The Constant
instructions allow the construction of larger constants.
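As a sketch of how a full word constant can be built from two 16-bit
pieces, the following Python model mirrors the CONST, CONSTH, and CONSTN
operation descriptions summarized in Figure 41. The lowercase function
names and the load_constant helper are hypothetical; they model register
values only and are not assembler syntax.

```python
# Illustrative model only: function names are hypothetical.

def const(i16):
    """CONST-style result: 16-bit constant, zero-extended to a word."""
    return i16 & 0xFFFF

def consth(srca, i16):
    """CONSTH-style result: high-order half-word of srca replaced."""
    return ((i16 & 0xFFFF) << 16) | (srca & 0xFFFF)

def constn(i16):
    """CONSTN-style result: 16-bit constant with the high half all 1s."""
    return 0xFFFF0000 | (i16 & 0xFFFF)

def load_constant(value):
    """Build a 32-bit constant in two steps: low half, then high half."""
    reg = const(value & 0xFFFF)
    return consth(reg, (value >> 16) & 0xFFFF)
```

Small negative values fit in a single step, since the ones-extended form
already supplies the upper half-word.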

Floating-Point 

The Floating-Point instructions (Figure 42) provide operations on
single-precision (32-bit) or double-precision (64-bit) floating-point
data. In addition, they provide conversions between single-precision,
double-precision, and integer number representations. In the current
processor implementation, these instructions cause traps to occur in
routines that perform the floating-point operations.

Branch 

The Branch instructions (Figure 43) control the execution flow of
instructions. Branch target addresses may be absolute, relative to the
Program Counter (with the offset given by a signed instruction constant),
or contained in a general-purpose register. For conditional jumps, the
outcome of the jump is based on a Boolean value in a general-purpose
register. Procedure calls are unconditional and save the return address in
a general-purpose register. All branches have a delayed effect; the
instruction sequence following the branch is executed regardless of the
outcome of the branch.

Miscellaneous 

The Miscellaneous instructions (Figure 44) perform various operations that
cannot be grouped into other instruction classes. In certain cases, these
are control functions available only to Supervisor-mode programs.

Mnemonic    Operation Description

ADD         DEST <- SRCA + SRCB

ADDS        DEST <- SRCA + SRCB
            IF signed overflow THEN Trap (Out Of Range)

ADDU        DEST <- SRCA + SRCB
            IF unsigned overflow THEN Trap (Out Of Range)

ADDC        DEST <- SRCA + SRCB + C

ADDCS       DEST <- SRCA + SRCB + C
            IF signed overflow THEN Trap (Out Of Range)

ADDCU       DEST <- SRCA + SRCB + C
            IF unsigned overflow THEN Trap (Out Of Range)

SUB         DEST <- SRCA - SRCB

SUBS        DEST <- SRCA - SRCB
            IF signed overflow THEN Trap (Out Of Range)

SUBU        DEST <- SRCA - SRCB
            IF unsigned underflow THEN Trap (Out Of Range)

SUBC        DEST <- SRCA - SRCB - 1 + C

SUBCS       DEST <- SRCA - SRCB - 1 + C
            IF signed overflow THEN Trap (Out Of Range)

SUBCU       DEST <- SRCA - SRCB - 1 + C
            IF unsigned underflow THEN Trap (Out Of Range)

SUBR        DEST <- SRCB - SRCA

SUBRS       DEST <- SRCB - SRCA
            IF signed overflow THEN Trap (Out Of Range)

SUBRU       DEST <- SRCB - SRCA
            IF unsigned underflow THEN Trap (Out Of Range)

SUBRC       DEST <- SRCB - SRCA - 1 + C

SUBRCS      DEST <- SRCB - SRCA - 1 + C
            IF signed overflow THEN Trap (Out Of Range)

SUBRCU      DEST <- SRCB - SRCA - 1 + C
            IF unsigned underflow THEN Trap (Out Of Range)

MULTIPLU    DEST <- SRCA * SRCB (unsigned)

MULTIPLY    DEST <- SRCA * SRCB (signed)

MUL         Perform 1-bit step of a multiply operation (signed)

MULL        Complete a sequence of multiply steps

MULTM       DEST <- SRCA * SRCB (signed), most-significant bits

MULTMU      DEST <- SRCA * SRCB (unsigned), most-significant bits

MULU        Perform 1-bit step of a multiply operation (unsigned)

DIVIDE      DEST <- (Q//SRCA)/SRCB (signed)
            Q <- Remainder

DIVIDU      DEST <- (Q//SRCA)/SRCB (unsigned)
            Q <- Remainder

DIV0        Initialize for a sequence of divide steps (unsigned)

DIV         Perform 1-bit step of a divide operation (unsigned)

DIVL        Complete a sequence of divide steps (unsigned)

DIVREM      Generate remainder for divide operation (unsigned)

Figure 36. Integer Arithmetic Instructions

Mnemonic    Operation Description

CPEQ        IF SRCA = SRCB THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPNEQ       IF SRCA <> SRCB THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPLT        IF SRCA < SRCB THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPLTU       IF SRCA < SRCB (unsigned) THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPLE        IF SRCA <= SRCB THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPLEU       IF SRCA <= SRCB (unsigned) THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPGT        IF SRCA > SRCB THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPGTU       IF SRCA > SRCB (unsigned) THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPGE        IF SRCA >= SRCB THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPGEU       IF SRCA >= SRCB (unsigned) THEN DEST <- TRUE
            ELSE DEST <- FALSE

CPBYTE      IF (SRCA.BYTE0 = SRCB.BYTE0) OR
               (SRCA.BYTE1 = SRCB.BYTE1) OR
               (SRCA.BYTE2 = SRCB.BYTE2) OR
               (SRCA.BYTE3 = SRCB.BYTE3) THEN DEST <- TRUE
            ELSE DEST <- FALSE

ASEQ        IF SRCA = SRCB THEN Continue
            ELSE Trap (VN)

ASNEQ       IF SRCA <> SRCB THEN Continue
            ELSE Trap (VN)

ASLT        IF SRCA < SRCB THEN Continue
            ELSE Trap (VN)

ASLTU       IF SRCA < SRCB (unsigned) THEN Continue
            ELSE Trap (VN)

ASLE        IF SRCA <= SRCB THEN Continue
            ELSE Trap (VN)

ASLEU       IF SRCA <= SRCB (unsigned) THEN Continue
            ELSE Trap (VN)

ASGT        IF SRCA > SRCB THEN Continue
            ELSE Trap (VN)

ASGTU       IF SRCA > SRCB (unsigned) THEN Continue
            ELSE Trap (VN)

ASGE        IF SRCA >= SRCB THEN Continue
            ELSE Trap (VN)

ASGEU       IF SRCA >= SRCB (unsigned) THEN Continue
            ELSE Trap (VN)

Figure 37. Compare Instructions

Mnemonic    Operation Description

AND         DEST <- SRCA & SRCB

ANDN        DEST <- SRCA & ~SRCB

NAND        DEST <- ~(SRCA & SRCB)

OR          DEST <- SRCA | SRCB

NOR         DEST <- ~(SRCA | SRCB)

XOR         DEST <- SRCA ^ SRCB

XNOR        DEST <- ~(SRCA ^ SRCB)

Figure 38. Logical Instructions



Mnemonic    Operation Description

SLL         DEST <- SRCA << SRCB (zero fill)

SRL         DEST <- SRCA >> SRCB (zero fill)

SRA         DEST <- SRCA >> SRCB (sign fill)

EXTRACT     DEST <- high-order word of (SRCA//SRCB << FC)

Figure 39. Shift Instructions



Reserved Instructions

Sixteen Am29000 operation codes are reserved for instruction emulation.
These instructions cause traps, much like the floating-point instructions,
but currently have no specified interpretation. The relevant operation
codes and the corresponding trap vectors are:

Operation Codes    Trap Vector Numbers
(hexadecimal)      (decimal)

D8-DD              24-29
E7-E9              39-41
F8                 56
FA-FF              58-63

These instructions are intended for future processor enhancements, and
users desiring compatibility with future processor versions should not use
them for any purpose.

Mnemonic    Operation Description

LOAD        DEST <- EXTERNAL WORD [SRCB]

LOADL       DEST <- EXTERNAL WORD [SRCB]
            assert LOCK output during access

LOADSET     DEST <- EXTERNAL WORD [SRCB]
            EXTERNAL WORD [SRCB] <- h'FFFFFFFF'
            assert LOCK output during access

LOADM       DEST .. DEST + COUNT <-
            EXTERNAL WORD [SRCB] ..
            EXTERNAL WORD [SRCB + COUNT * 4]

STORE       EXTERNAL WORD [SRCB] <- SRCA

STOREL      EXTERNAL WORD [SRCB] <- SRCA
            assert LOCK output during access

STOREM      EXTERNAL WORD [SRCB] ..
            EXTERNAL WORD [SRCB + COUNT * 4] <-
            SRCA .. SRCA + COUNT

EXBYTE      DEST <- SRCB, with low-order byte replaced
            by byte in SRCA selected by BP

EXHW        DEST <- SRCB, with low-order half-word replaced
            by half-word in SRCA selected by BP

EXHWS       DEST <- half-word in SRCA selected by BP,
            sign-extended to 32 bits

INBYTE      DEST <- SRCA, with byte selected by BP replaced
            by low-order byte of SRCB

INHW        DEST <- SRCA, with half-word selected by BP replaced
            by low-order half-word of SRCB

MFSR        DEST <- SPECIAL

MFTLB       DEST <- TLB [SRCA]

MTSR        SPDEST <- SRCB

MTSRIM      SPDEST <- 0I16

MTTLB       TLB [SRCA] <- SRCB

Figure 40. Data Movement Instructions



Mnemonic    Operation Description

CONST       DEST <- 0I16

CONSTH      Replace high-order half-word of SRCA by I16

CONSTN      DEST <- 1I16


Figure 41. Constant Instructions

Mnemonic    Operation Description

FADD        DEST (single-precision) <- SRCA (single-precision)
            + SRCB (single-precision)

DADD        DEST (double-precision) <- SRCA (double-precision)
            + SRCB (double-precision)

FSUB        DEST (single-precision) <- SRCA (single-precision)
            - SRCB (single-precision)

DSUB        DEST (double-precision) <- SRCA (double-precision)
            - SRCB (double-precision)

FMUL        DEST (single-precision) <- SRCA (single-precision)
            * SRCB (single-precision)

FDMUL       DEST (double-precision) <- SRCA (single-precision)
            * SRCB (single-precision)

DMUL        DEST (double-precision) <- SRCA (double-precision)
            * SRCB (double-precision)

FDIV        DEST (single-precision) <- SRCA (single-precision)
            / SRCB (single-precision)

DDIV        DEST (double-precision) <- SRCA (double-precision)
            / SRCB (double-precision)

FEQ         IF SRCA (single-precision) = SRCB (single-precision)
            THEN DEST <- TRUE
            ELSE DEST <- FALSE

DEQ         IF SRCA (double-precision) = SRCB (double-precision)
            THEN DEST <- TRUE
            ELSE DEST <- FALSE

FGE         IF SRCA (single-precision) >= SRCB (single-precision)
            THEN DEST <- TRUE
            ELSE DEST <- FALSE

DGE         IF SRCA (double-precision) >= SRCB (double-precision)
            THEN DEST <- TRUE
            ELSE DEST <- FALSE

FGT         IF SRCA (single-precision) > SRCB (single-precision)
            THEN DEST <- TRUE
            ELSE DEST <- FALSE

DGT         IF SRCA (double-precision) > SRCB (double-precision)
            THEN DEST <- TRUE
            ELSE DEST <- FALSE

SQRT        DEST (single-precision, double-precision, extended-precision)
            <- SQRT[SRCA (single-precision, double-precision,
            extended-precision)]

CONVERT     DEST (integer, single-precision, double-precision)
            <- SRCA (integer, single-precision, double-precision)

CLASS       DEST (single-precision, double-precision, extended-precision)
            <- CLASS[SRCA (single-precision, double-precision,
            extended-precision)]


Figure 42. Floating-Point Instructions

Mnemonic    Operation Description

CALL        DEST <- PC//00 + 8
            PC <- TARGET
            Execute delay instruction

CALLI       DEST <- PC//00 + 8
            PC <- SRCB
            Execute delay instruction

JMP         PC <- TARGET
            Execute delay instruction

JMPI        PC <- SRCB
            Execute delay instruction

JMPT        IF SRCA = TRUE THEN PC <- TARGET
            Execute delay instruction

JMPTI       IF SRCA = TRUE THEN PC <- SRCB
            Execute delay instruction

JMPF        IF SRCA = FALSE THEN PC <- TARGET
            Execute delay instruction

JMPFI       IF SRCA = FALSE THEN PC <- SRCB
            Execute delay instruction

JMPFDEC     IF SRCA = FALSE THEN
                SRCA <- SRCA - 1
                PC <- TARGET
            ELSE
                SRCA <- SRCA - 1
            Execute delay instruction

Figure 43. Branch Instructions



Mnemonic    Operation Description

CLZ         Determine number of leading zeros in a word

SETIP       Set IPA, IPB, and IPC with operand register numbers

EMULATE     Load IPA and IPB with operand register numbers, and Trap (VN)

INV         Reset all Valid bits in Branch Target Cache to zeros

IRET        Perform an interrupt return sequence

IRETINV     Perform an interrupt return sequence, and reset all Valid bits
            in Branch Target Cache to zeros

HALT        Enter Halt mode on next cycle


Figure 44. Miscellaneous Instructions

DATA FORMATS AND HANDLING

This section describes the various data types supported by the Am29000,
and the mechanisms for accessing data in external devices and memories.
The Am29000 includes provisions for the external access of bytes,
half-words, unaligned words, and unaligned half-words, as described in
this section.

Integer Data Types 

Most Am29000 instructions deal directly with word-length integer data;
integers may be either signed or unsigned, depending on the instruction.
Some instructions (e.g., AND) treat word-length operands as strings of
bits. In addition, there is support for character, half-word, and Boolean
data types.

Byte Operations 

The processor supports character data through load, store, extraction, and
insertion operations on word-length operands, and by a compare operation
on byte-length fields within words. The format for unsigned and signed
characters is shown in Figure 45; for signed characters, the sign bit is
the most-significant bit of the character. For sequences of packed
characters within words, bytes are ordered either left-to-right or
right-to-left, depending on the Byte Order (BO) bit of the Configuration
Register (see Addressing and Alignment section).

If the Data Width Enable (DW) bit of the Configuration Register is 1, the
Am29000 is enabled to load and store byte data. On a load, an external
packed byte is converted to one of the character formats shown in
Figure 45. On a store, the low-order byte of a word is packed into every
byte of an external word. The External Data Accesses section describes
external byte accesses in more detail.

The Extract Byte (EXBYTE) instruction replaces the low-order character of
a destination word with an arbitrary byte-aligned character from a source
word. For the EXBYTE instruction, the destination word can be a zero word,
which effectively zero-extends the character from the source operand.

The Insert Byte (INBYTE) instruction replaces an arbitrary byte-aligned
character in a destination word with the low-order character of a source
word. For the INBYTE instruction, the source operand can be a character
constant specified by the instruction.

The Compare Bytes (CPBYTE) instruction compares 
two word-length operands and gives a result of True if 
any corresponding bytes within the operands have 
equivalent values. This allows programs to detect char- 
acters within words without first having to extract individ- 
ual characters, one at a time, from the word of interest. 
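The byte-wise comparison described above can be modeled in a few lines of
Python. This is an illustrative sketch only: the function name cpbyte is a
hypothetical stand-in for the instruction, and the TRUE and FALSE values
follow the Boolean format used by the processor (a 1 or 0 in the
most-significant bit of a word, with the remaining bits reset).

```python
# Illustrative model only: names are hypothetical.

TRUE = 0x80000000    # Boolean True: most-significant bit set
FALSE = 0x00000000   # Boolean False: most-significant bit clear

def cpbyte(srca, srcb):
    """True if any corresponding bytes of the two words are equal."""
    for shift in (0, 8, 16, 24):
        if ((srca >> shift) & 0xFF) == ((srcb >> shift) & 0xFF):
            return TRUE
    return FALSE

def contains_zero_byte(word):
    """Detect a zero (terminator) character anywhere in a packed word."""
    return cpbyte(word, 0) == TRUE
```

Comparing against a word of zeros is one way to find a terminating
character in packed text without extracting each byte in turn.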

Half-Word Operations 

The processor supports half-word data through load, 
store, insertion, and extraction operations on word- 
length operands. The format for unsigned and signed 
half-words is shown in Figure 46; for signed half-words, 
the sign bit is the most-significant bit of the half-word. 
For sequences of packed half-words within words, half- 
words are ordered either left-to-right or right-to-left, de- 
pending on the Byte Order (BO) bit of the Configuration 
Register (see Addressing and Alignment section). 

If the Data Width Enable (DW) bit of the Configuration Register is 1, the
Am29000 is enabled to load and store half-word data. On a load, an
external packed half-word is converted to one of the formats shown in
Figure 46. On a store, the low-order half-word of a word is packed into
every half-word of an external word.

The Extract Half-Word (EXHW) instruction replaces the 
low-order half-word of a destination word with either the 
low-order or high-order half-word of a source word. For 
the EXHW instruction, the destination word can be a 
zero word, which effectively zero-extends the half-word 
from the source operand. 

The Extract Half-Word, Sign-Extended (EXHWS) instruction is similar to the
EXHW instruction, except that it sign-extends the half-word in the
destination word (i.e., it replaces the most-significant 16 bits of the
destination word with the most-significant bit of the source half-word).
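The difference between the zero-extending and sign-extending forms can be
sketched in Python as follows. The function names exhw and exhws are
hypothetical models of the results only; to keep the sketch short, the
half-word argument is assumed to be already selected from the source word,
so the selection step itself is not modeled.

```python
# Illustrative model only: names are hypothetical, and the half-word is
# assumed to be already selected from the source word.

def exhw(dest, half):
    """Replace the low-order half-word of dest with the given half-word."""
    return (dest & 0xFFFF0000) | (half & 0xFFFF)

def exhws(half):
    """Sign-extend a 16-bit half-word to a 32-bit word."""
    half &= 0xFFFF
    if half & 0x8000:                 # sign bit of the half-word
        return half | 0xFFFF0000      # fill the upper 16 bits with 1s
    return half                       # upper 16 bits stay 0
```

Using a zero word as the destination of the first form gives the
zero-extended half-word, as the text notes for EXHW.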

The Insert Half-Word (INHW) instruction replaces either 
the low-order or high-order half-word in a destination 
word with the low-order half-word of a source word. 



Unsigned:

 31                     8 7        0
 000000000000000000000000   data

Signed:

 31                     8 7 6      0
 ssssssssssssssssssssssss s  data


Figure 45. Character Format

Unsigned:

 31            16 15              0
 0000000000000000       data

Signed:

 31            16 15 14           0
 ssssssssssssssss s      data

Figure 46. Half-Word Format



Boolean Data 

Some instructions in the Compare class generate word-length Boolean
results. Also, conditional branches are conditional upon Boolean operands.
The Boolean format used by the processor is such that the Boolean values
True and False are represented by a 1 or 0, respectively, in the
most-significant bit of a word. The remaining bits are unimportant; for
the compare instructions, they are reset. Note that twos-complement
negative integers are indicated by the Boolean value True in this encoding
scheme.
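A brief Python sketch may make the encoding concrete. It models a signed
compare that produces the Boolean format described above, and a test of
that Boolean as a conditional branch would see it. The names cplt, is_true,
and to_signed are hypothetical helpers, not processor or assembler syntax.

```python
# Illustrative model only: names are hypothetical.

TRUE = 0x80000000    # 1 in the most-significant bit, other bits reset
FALSE = 0x00000000

def to_signed(word):
    """Interpret a 32-bit word as a twos-complement integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word

def cplt(srca, srcb):
    """Signed less-than compare producing a word-length Boolean."""
    return TRUE if to_signed(srca) < to_signed(srcb) else FALSE

def is_true(word):
    """Only the most-significant bit decides the Boolean value."""
    return (word & 0x80000000) != 0
```

Because only the most-significant bit is examined, is_true applied
directly to an integer doubles as a test for a negative value.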

Floating-Point Data Types 

The Am29000 defines single- and double-precision 
floating-point formats that comply with the IEEE Stan- 
dard for Binary Floating-Point Arithmetic (ANSI/IEEE 
Std. 754-1985). These data types are not supported di- 
rectly in processor hardware, but can be implemented 
by a virtual floating-point interface provided in the 
Am29000. 

In this section, the following nomenclature is used to de- 
note fields in a floating-point value: 

■ s: sign bit 

■ bexp: biased exponent 

■ frac: fraction 

■ sig: significand 

Single-Precision Floating-Point 

The format for a single-precision floating-point value is 
shown in Figure 47. 



Typically, the value of a single-precision operand is expressed by:

(-1)**s * 1.frac * 2**(bexp - 127).



The encoding of special floating-point values is given in 
the Special Floating-Point Values section. 

Double-Precision Floating-Point

The format for a double-precision floating-point value is shown in
Figure 48.

Typically, the value of a double-precision operand is expressed by:

(-1)**s * 1.frac * 2**(bexp - 1023).

The encoding of special floating-point values is given in 
the Special Floating-Point Values section. 

In order to be properly referenced by a floating-point instruction, a
double-precision floating-point value must be double-word aligned. The
absolute register number of the register containing the first word
(labeled "0" in Figure 48) must be even. The absolute register number of
the register containing the second word (labeled "1" in Figure 48) must be
odd. If these conditions are not met, the results of the instruction are
unpredictable. Note that the appropriate registers for a double-precision
value in the local registers depend on the value of the Stack Pointer.

Figure 47. Single-Precision Floating-Point Format

Figure 48. Double-Precision Floating-Point Format



Special Floating-Point Values 

The Am29000 defines floating-point values that are en- 
coded for special interpretation. The values are de- 
scribed in this section. 

Not-a-Number 

A Not-a-Number (NaN) is a symbolic value used to report certain
floating-point exceptions. It also can be used to implement user-defined
extensions to floating-point operations. A NaN comprises a floating-point
number with maximum biased exponent and non-zero fraction. The sign bit
can be either 0 or 1 and has no significance. There are two types of NaN:
signaling NaNs and quiet NaNs. A signaling NaN causes an Invalid Operation
exception if used as an input operand to a floating-point operation; a
quiet NaN does not cause an exception. The Am29000 distinguishes signaling
and quiet NaNs by the most-significant bit of the fraction: a 1 indicates
a quiet NaN, and a 0 indicates a signaling NaN.

An operation never generates a signaling NaN as a re- 
sult. A quiet NaN result can be generated in one of two 
ways: 

■ as the result of an invalid operation that can- 
not generate a reasonable result, or 

■ as the result of an operation for which one or 
more input operands are either signaling or 
quiet NaNs. 

In either case, the Am29000 produces a quiet NaN having a fraction of
11000 ... 0; that is, the two most-significant bits of the fraction are
11, and the remaining bits are 0. If desired, the Reserved Operand
exception can be enabled to cause a Floating-Point Exception trap. The
trap handler in this case can implement a scheme whereby user-defined NaN
values appear to pass through operations as results, providing overall
status for a series of operations.

Infinity 

Infinity is an encoded value used to represent a value that is too large
to be represented as a finite number in a given floating-point format.
Infinity comprises a floating-point number with maximum biased exponent
and zero fraction. The sign bit of an infinity distinguishes +∞ from -∞.



Denormalized Numbers

The IEEE Standard specifies that, wherever possible, a result that is too
small to be represented as a normalized number be represented as a
denormalized number. A denormalized number may be used as an input operand
to any operation. For single- and double-precision formats, a denormalized
number comprises a floating-point number with a biased exponent of 0 and a
non-zero fraction field; the sign bit can be either 1 or 0. The value of a
denormalized number is expressed by:

(-1)**s * 0.frac * 2**(-bias + 1),



where "bias" is the exponent bias for the format in 
question. 

Zero 

A zero comprises a floating-point number with a biased exponent of 0 and a
zero fraction field. The sign bit of a zero can be either 0 or 1; however,
positive and negative zero are both exactly zero, and are considered equal
by comparison operations.
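The special encodings above can be summarized with a Python sketch that
classifies a single-precision bit pattern and evaluates the value formulas
given in this section. It assumes the IEEE 754 single-precision field
widths (1 sign bit, 8 biased-exponent bits, 23 fraction bits, bias 127);
the function names classify_single and value_single are hypothetical.

```python
# Illustrative model only: names are hypothetical; field widths are the
# IEEE 754 single-precision widths (1, 8, and 23 bits).

def classify_single(word):
    """Name the kind of value encoded by a 32-bit single-precision word."""
    bexp = (word >> 23) & 0xFF
    frac = word & 0x7FFFFF
    if bexp == 0xFF:                       # maximum biased exponent
        if frac == 0:
            return "infinity"
        # Most-significant fraction bit: 1 = quiet, 0 = signaling.
        return "quiet NaN" if frac & 0x400000 else "signaling NaN"
    if bexp == 0:
        return "zero" if frac == 0 else "denormalized"
    return "normalized"

def value_single(word):
    """Evaluate a zero, denormalized, or normalized single-precision word."""
    kind = classify_single(word)
    if kind in ("infinity", "quiet NaN", "signaling NaN"):
        raise ValueError("no finite value: " + kind)
    sign = -1.0 if (word >> 31) & 1 else 1.0
    bexp = (word >> 23) & 0xFF
    frac = (word & 0x7FFFFF) / float(1 << 23)
    if bexp == 0:
        return sign * frac * 2.0 ** (-127 + 1)      # 0.frac * 2**(-bias+1)
    return sign * (1.0 + frac) * 2.0 ** (bexp - 127)  # 1.frac * 2**(bexp-127)
```

The quiet NaN that the text describes as the generated result, with the
two most-significant fraction bits set, corresponds to a fraction field of
0x600000 in this model.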

External Data Accesses 

All processor external accesses occur between 
general-purpose registers and external devices and 
memories. Accesses occur as the result of the execu- 
tion of load and store instructions. The load and store in- 
structions specify which general-purpose register re- 
ceives the data (for a load) or supplies the data (for a 
store). The format of the load and store instructions is 
shown in Figure 49. 

Addresses for accesses are given either by the content 
of a general-purpose register or by a constant value 
specified by the load or store instruction. The load and 
store instructions do not perform address computation 
directly. Any required address computations are per- 
formed explicitly by other instructions. 

In the load or store instruction, the Coprocessor Enable (CE) bit (bit 23)
determines whether or not the access is directed to the coprocessor. If
the CE bit is 0, the access is directed to an external device or memory.
If the CE bit is 1, data is transferred to or from the coprocessor. The CE
bit affects the interpretation of the Control (CNTL) field as well as the
channel protocol. This section deals with all external accesses other than
coprocessor accesses.

Figure 49. Load/Store Instruction Format

The format of the instructions that do not perform coprocessor data
transfers (i.e., in which the CE bit is 0) is shown in Figure 50.

In load and store instructions, the "RB or I" field specifies the address
for access. The address is either the content of a general-purpose
register, with register number RB, or a constant with a value I
(zero-extended to 32 bits). The M bit determines whether the register or
the constant is used.

The data for the access is written into the general- 
purpose register RA for a load, and is supplied by regis- 
ter RA for a store. 

The definitions for other fields in the load or store in- 
struction are given below: 

Bit 23: Coprocessor Enable (CE). The CE bit is 0 for a non-coprocessor
load or store.

Bit 22: Address Space (AS). If the AS bit is 0 for an untranslated load or
store, the access is directed to instruction/data memory. If the AS bit is
1 for an untranslated load or store, the access is directed to
input/output. The AS bit must be 0 for a translated load or store; if the
AS bit is 1 for a translated load or store, a Protection Violation trap
occurs. The address space for a translated load or store is determined by
the Input/Output (IO) bit of the associated TLB entry.

Bit 21: Physical Address (PA). The PA bit may be used by a Supervisor-mode
program to disable address translation for an access. If the PA bit is 1,
then address translation is not performed for the access, regardless of
the value of the Physical Addressing/Data (PD) bit in the Current
Processor Status Register. If the PA bit is 0, address translation depends
on the PD bit.

The PA bit may be 1 only for Supervisor-mode instructions. If it is 1 for
a User-mode instruction, a Protection Violation trap occurs.

Bit 20: Set Byte Pointer/Sign Bit (SB). If the Data Width Enable (DW) bit
of the Configuration Register is 0 and the SB bit is 1, the Byte Pointer
Register is written with the two least-significant bits of the address for
the access. These address bits can control subsequent character and
half-word operations. If the SB bit is 0, the Byte Pointer Register is not
affected.

If the Data Width Enable (DW) bit of the Configuration 
Register is 1 and the SB bit is 1 for a load, the loaded 
byte or half-word is sign-extended in the destination reg- 
ister; if the SB bit is 0, the byte or half-word is zero-ex- 
tended. If the DW bit is 1 and the SB bit is 1 for either a 
load or store, then each bit of the Byte Pointer Register 
is written with the complement of the Byte Order bit of 
the Configuration Register. The Byte Pointer Register is 
set in this case to provide software compatibility across 
different types of memory systems. If the SB bit is 0, the 
Byte Pointer Register is not affected. 

Bit 19: User Access (UA)— The UA bit allows pro- 
grams executing in the Supen/isor mode to emulate 
User-mode accesses. This allows checi<ing of the 
authorization of an access requested by a User-mode 
program. It also causes address translation (if applica- 
ble) to be performed using the PID field of the MMU 
Configuration Register, rather than the fixed Supervi- 
sor-mode process identifier zero. 

If the UA bit is 1 for a Supervisor-mode load or store, the
access associated with the instruction is performed in
the User mode. In this case, the User mode affects only
TLB protection checking, the SUP/US output, and the
use of the PID field in translation; it has no effect on the
registers that can be accessed by the instruction. If the
UA bit is 0, the program mode for the access is
controlled by the SM bit.

If the UA bit is 1 for a User-mode load or store, a
Protection Violation trap occurs.

Figure 50. Non-Coprocessor Load/Store Format

Bits 18-16: Option (OPT)— This field is placed on the 
OPT2-OPT0 outputs during the address cycle of the
access. There is a one-to-one correspondence between
the OPT field and the OPT2-OPT0 outputs; that is, the
most-significant OPT bit is placed on OPT2, and so on.

The OPT field controls system functions as described 
below. 

Bits 15-8: (RA)— The data for the access is written into 
the general-purpose register RA for a load, and is sup- 
plied by register RA for a store. 

Bits 7-0: (RB or I)— In load and store instructions, the
"RB or I" field specifies the address for the access. The 
address is either the content of a general-purpose reg- 
ister with register number RB, or a constant value I 
(zero-extended to 32 bits). The M bit of the operation 
code (bit 24) determines whether the register or the con- 
stant is used. 
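
The field positions given above can be illustrated with a short decoding sketch. It is a minimal model, not AMD code: the function name and the example word are hypothetical, and only the fields whose bit positions are stated in this section are extracted.

```python
def decode_load_store_fields(instr: int) -> dict:
    """Split a 32-bit load/store instruction word into the fields listed above."""
    return {
        "M": (instr >> 24) & 0x1,      # bit 24: selects register RB or constant I
        "PA": (instr >> 21) & 0x1,     # bit 21: Physical Address
        "SB": (instr >> 20) & 0x1,     # bit 20: Set Byte Pointer/Sign Bit
        "UA": (instr >> 19) & 0x1,     # bit 19: User Access
        "OPT": (instr >> 16) & 0x7,    # bits 18-16: Option
        "RA": (instr >> 8) & 0xFF,     # bits 15-8: data register number
        "RB_or_I": instr & 0xFF,       # bits 7-0: address register number or constant
    }


# Hypothetical instruction word built only from the fields of interest.
example_word = (1 << 24) | (1 << 20) | (0b001 << 16) | (0x60 << 8) | 0x04
fields = decode_load_store_fields(example_word)
```

The remaining bits of the word (the rest of the operation code and the bits above bit 21) are deliberately left uninterpreted here.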

Load and store operations are overlapped with the exe- 
cution of instructions that follow the load or store instruc- 
tion. Only one load or store may be in progress on any 
given cycle. If a load or store instruction is encountered 
while another load or store operation is in progress, the 
processor enters the Pipeline Hold mode until the first
operation is completed. However, the address for the
second operation may appear on the address bus if the
first operation is to a device or memory that supports
pipelined operations (see Pipelined Accesses section).

Load Operations 

The processor provides the following instructions for
performing load operations: Load (LOAD), Load and
Lock (LOADL), Load and Set (LOADSET), and Load
Multiple (LOADM). All of these instructions transfer data
from an external device or memory into one or more
general-purpose registers.

The LOADL instruction supports the implementation of
device and memory interlocks in a multiprocessor
configuration. It activates the LOCK output during the
address cycle of the access.

The LOADSET instruction implements a binary
semaphore. It loads a general-purpose register and
automatically writes the accessed location with a word
that has 1 in every bit position (that is, the write is
indivisible from the read). The LOCK output is asserted
during both the read and write accesses. Note that, if
address translation is enabled for the LOADSET
instruction, the TLB memory-protection bits must allow
both the read and write accesses. If either the read or
write access is not allowed, neither access is performed.
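
The read-then-set behavior of LOADSET can be sketched as a simple test-and-set lock. This is a single-threaded model only: memory is a Python dictionary, the function names are hypothetical, and the convention that a word of 0 means "free" is an assumption of the sketch rather than something the processor defines.

```python
ALL_ONES = 0xFFFFFFFF


def loadset(memory: dict, address: int) -> int:
    """Return the old word at `address` and leave a word of all ones there."""
    old = memory.get(address, 0)
    memory[address] = ALL_ONES      # the write is treated as indivisible from the read
    return old


def try_acquire(memory: dict, address: int) -> bool:
    """Assumed convention: 0 means free, all ones means held."""
    return loadset(memory, address) == 0


def release(memory: dict, address: int) -> None:
    """Free the semaphore with an ordinary store of 0."""
    memory[address] = 0
```

A caller that receives a nonzero value from the load knows the semaphore was already held and has not changed its state, since the location already contained all ones.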

The LOADM instruction loads a specified number of
registers from sequential addresses, as explained below.

Load operations are overlapped with the execution of
instructions that follow the load instruction. The processor
detects any dependencies on the loaded data that
subsequent instructions may have, and, if such a
dependency is detected, enters the Pipeline Hold mode until
the data are returned by the external device or memory.
If a register that is the target of an incomplete load is
written with the result of a subsequent instruction, the
processor does not write the returning data into the
register when the load is completed; the Not Needed (NN)
bit in the Channel Control Register is set in this case.

Store Operations 

The processor provides the following instructions for
performing store operations: Store (STORE), Store and
Lock (STOREL), and Store Multiple (STOREM). All of
these instructions transfer data from one or more
general-purpose registers to an external device or 
memory. 

The STOREL instruction supports the implementation
of device and memory interlocks in a multiprocessor
configuration. It activates the LOCK output during the
address cycle of the access.

The STOREM instruction stores a specified number of
registers to sequential addresses, as explained below. 

Store operations are overlapped with the execution of 
instructions that follow the store instruction. However,
no data dependencies can exist since the store prevents 
any subsequent accesses until it is completed. 

Multiple Accesses 

Load Multiple (LOADM) and Store Multiple (STOREM) 
instructions move contiguous words of data between
general-purpose registers and external devices and
memories. The number of transfers is determined by the
Load/Store Count Remaining Register. 

The Load/Store Count Remaining (CR) field in the Load/ 
Store Count Remaining Register specifies the number 
of transfers to be performed by the next LOADM or 
STOREM executed in the instruction sequence. The CR
field is in the range of 0 to 255 and is zero-based; a count
value of 0 represents one transfer, and a count value of
255 represents 256 transfers. The CR field also appears 
in the Channel Control Register. 

Before a LOADM or STOREM is executed, the CR field 
is set by a Move To Special Register. A LOADM or 
STOREM uses the most recently written value of the CR 
field. If an attempt is made to alter the CR field and the 
Channel Control Register contains information for an 
external access that has not yet been completed, the 
processor enters the Pipeline Hold mode until the
access is completed. Note that since the CR field is set
independently of the LOADM and STOREM, the CR field
may represent a valid state of an interrupted program 
even if the Contents Valid (CV) bit of the Channel 
Control Register is 0. 

Because of the pipelined implementation of LOADM
and STOREM, at least one instruction (e.g., the
instruction that sets the CR field) must separate two
successive LOADM and/or STOREM instructions.

After the CR field is set, the execution of a LOADM or 
STOREM begins the data transfer. As with any other 
load or store operation, the LOADM or STOREM waits 
until any pending load or store operation is complete 
before starting. The LOADM instruction specifies
the starting address and starting destination general- 
purpose register. The STOREM instruction specifies the 
starting address and the starting source general- 
purpose register. 

During the execution of the LOADM or STOREM 
instruction, the processor updates the address and reg- 
ister number after every access, incrementing the 
address by 4 and the register number by 1. This
continues until either all accesses are completed or an
interrupt or trap is taken.

For a Load Multiple or Store Multiple address sequence, 
addresses wrap from the largest possible value (hexa- 
decimal FFFFFFFC) to the smallest possible value 
(hexadecimal 00000000). 

The processor increments absolute register numbers
during the Load Multiple or Store Multiple sequence.
Absolute register numbers wrap from 127 to 128, and from
255 to 128. Thus, a sequence that begins in the global
registers may make a transition to the local registers, but 
a sequence that begins in the local registers remains in 
the local registers. Also, note that the local registers are 
addressed circularly. 
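
The address and register sequencing described above can be modeled in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, interrupts and traps are ignored, and the register numbers are absolute register numbers.

```python
def multiple_access_sequence(start_addr: int, start_reg: int, cr: int) -> list:
    """List the (address, absolute register number) pairs of a LOADM or STOREM."""
    if not 0 <= cr <= 255:
        raise ValueError("CR field must be in the range 0 to 255")
    addr = start_addr & 0xFFFFFFFF
    reg = start_reg
    sequence = []
    for _ in range(cr + 1):                 # CR is zero-based: 0 means one transfer
        sequence.append((addr, reg))
        addr = (addr + 4) & 0xFFFFFFFF      # FFFFFFFC wraps to 00000000
        reg = 128 if reg == 255 else reg + 1   # 127 -> 128, 255 -> 128
    return sequence
```

For example, a sequence starting at address FFFFFFF8 in register 126 with CR = 3 crosses both the address wrap and the global-to-local register transition.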

The normal restrictions on register accesses apply for 
the Load Multiple and Store Multiple sequences. For ex- 
ample, if a protected general-purpose register is en- 
countered in the sequence for a User-mode program, a 
Protection Violation trap occurs. 

Intermediate addresses are stored in the Channel Ad- 
dress Register, and register numbers are stored in the 
Target Register (TR) field of the Channel Control Regis- 
ter. For the STOREM instruction, the data for every 
access is stored in the Channel Data Register (this 
register also is set during the execution of the LOADM 
instruction, but has no interpretation in this case). The 
CR field is updated on the completion of every access so 
that it indicates the number of accesses remaining in the 
sequence. 

Load Multiple and Store Multiple operations are
indicated by the Multiple Operation (ML) bit in the Channel
Control Register. This bit may be 1 even though the CR
field has a value of 0 (indicating that one transfer
remains to be performed). The ML bit is used to restart a
multiple operation on an interrupt return; if it is set
independently by a Move To Special Register before a
load or store instruction is executed, the results are
unpredictable.

While a multiple load or store is executing, the processor 
is in the Pipeline Hold mode, suspending any subse- 
quent instruction execution until the multiple access is 
completed. If an interrupt or trap is taken, the Channel
Address, Channel Data, and Channel Control registers 
contain the state of the multiple access at the point of in- 
terruption. The multiple access may be resumed at this 
point, at a later time, by an interrupt return. 

The processor attempts to complete multiple accesses 
using the burst-mode capability of the channel (see 
Burst-Mode Accesses section). For this reason, multiple
accesses of individual bytes and half-words are not sup- 
ported. If the burst-mode access is preempted, the pro- 
cessor retransmits the address at the point of preemp- 
tion. If the external device or memory cannot support 
burst-mode accesses, the processor transmits an ad- 
dress for every access. If the address sequence causes 
a virtual page-boundary crossing, the processor 
preempts the burst-mode access, translates the ad- 
dress for the new page, and reestablishes the burst- 
mode access using the new physical address. 

The last load or store is executed as a simple access. 
The processor will preempt burst-mode transfer imme- 
diately prior to the last word of the transfer. 

Option Bits 

The Option field in the load and store instructions
supports system functions, such as byte and half-word
accesses. The definition of this field for a load or store,
depending on the AS bit of the instruction, is as follows:

Word-length access
Instruction ROM access (as data)
Cache control
ADAPT29K accesses
Reserved


Note that some of these encodings do not affect proces- 
sor operation, and could have other interpretations in a 
particular system. For example, the OPT values 000, 
001, and 010 affect processor operation only if the DW
bit of the Configuration Register is 1. However, non- 
standard uses of the OPT field have an implication on 
the portability of software between different systems.

Addressing and Alignment

Address Spaces 

External instructions and data are contained in one of 
four 32-bit address spaces: 

1. Instruction/Data Memory 

2. Input/Output 

3. Coprocessor 

4. Instruction Read-Only Memory (Instruction 
ROM). 

An address in the instruction/data memory address
space may be treated as virtual or physical, as
determined by the Current Processor Status Register.
Address translation for data accesses is enabled
separately from address translation for instruction accesses.
A program in the Supervisor mode may temporarily dis- 
able address translation for individual loads and stores; 
this permits load-real and store-real operations. 

It is possible to partition physical instruction and data ad- 
dresses into two separate physical address spaces. 
However, virtual instruction and data addresses appear 
in the same virtual address space (i.e., instruction/data
memory). 

The coprocessor address space is not an address 
space in the strictest sense. The coprocessor address 
space is defined so that transfers of operands and op- 
eration codes to the coprocessor do not interfere with 
other external devices and memories. 

The processor does not directly support the access of 
the instruction ROM address space using loads and 
stores; this capability is defined as a system option re- 
quiring external hardware. 

For untranslated data accesses, bits contained in load
and store instructions distinguish between the
instruction/data memory, input/output, and coprocessor
address spaces. For translated data accesses, the
Input/Output bit of the associated TLB entry distinguishes
between the instruction/data memory and input/output
address spaces.

For instruction fetches, the ROM Enable (RE) bit of the 
Current Processor Status Register distinguishes be- 
tween the instruction/data and instruction ROM address 
spaces. 

Byte and Half-Word Addressing

The Am29000 generates word-oriented byte addresses 
for accesses to external devices and memories. Ad- 
dresses are word-oriented because loads, stores, and 
instruction fetches access words. However, addresses 
are byte addresses because they are sufficient to select 
bytes packed within accessed words. For load and store 
operations, the processor provides means for using the 
least-significant address bits to access bytes and half- 
words within external words. 



The selection of a byte within a word is determined by 
the two least-significant bits of an address and the Byte 
Order (BO) bit of the Configuration Register. The selec- 
tion of a half-word within a word is determined by the 
next-to-least-significant bit of an address and the BO bit. 
Figure 51 illustrates the addressing of bytes and
half-words when the BO bit is 0, and Figure 52 illustrates the
addressing of bytes and half-words when the BO bit is 1.
In Figure 51 and Figure 52, addresses are represented
in hexadecimal notation.

In the processor, the two least-significant bits of an ex- 
ternal address can be reflected in the Byte Pointer (BP) 
field of the ALU Status Register when the DW bit of the 
Configuration Register is 0. Alternatively, the two least- 
significant bits of the address can be used to control byte 
and half-word accesses when the DW bit is 1. The BO bit
affects only the interpretation of the BP field and the two 
least-significant address bits. 

If the BO bit is 0, bytes are ordered within words such
that a 00 in the BP field or in the two least-significant
address bits selects the high-order byte of a word, and a 11
selects the low-order byte. If the BO bit is 1, a 00 in the
BP field or in the two least-significant address bits
selects the low-order byte of a word, and a 11 selects the
high-order byte.

If the BO bit is 0, half-words are ordered within words
such that a 0 in the most-significant bit of the BP field or
the next-to-least-significant address bit selects the
high-order half-word, and a 1 selects the low-order half-word.
If the BO bit is 1, a 0 in the most-significant bit of the BP
field or the next-to-least-significant address bit selects
the low-order half-word of a word, and a 1 selects the
high-order half-word. Note that since the least-significant
bit of the BP field or an address does not participate
in the selection of half-words, the alignment of
half-words is forced to half-word boundaries in this case.
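
The byte and half-word selection rules above reduce to a choice of shift amount. The following sketch models them; the function names are hypothetical, and `bo` is the Byte Order bit given as 0 or 1.

```python
def select_byte(word: int, address: int, bo: int) -> int:
    """Pick the byte of `word` addressed by the two least-significant address bits."""
    sel = address & 0b11
    # BO = 0: selector 00 is the high-order byte; BO = 1: selector 00 is the low-order byte.
    shift = 8 * sel if bo == 1 else 8 * (3 - sel)
    return (word >> shift) & 0xFF


def select_half_word(word: int, address: int, bo: int) -> int:
    """Pick the half-word addressed by the next-to-least-significant address bit."""
    bit1 = (address >> 1) & 1            # the least-significant bit is ignored
    high_order = bit1 == bo              # BO = 0: 0 -> high; BO = 1: 1 -> high
    return (word >> (16 if high_order else 0)) & 0xFFFF
```

The same selection applies when the selector comes from the BP field instead of the address bits.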

Alignment of Words and Half-Words 

Since only byte addressing is supported, it is possible 
that an address for the access of a word or half-word is 
not aligned to the desired word or half-word. The 
Am29000 either ignores or forces alignment in most 
cases. However, some systems may require that un- 
aligned accesses be supported for compatibility rea- 
sons. Because of this, the Am29000 provides an option 
that creates a trap when a nonaligned access is at- 
tempted. This trap allows software emulation of the non- 
aligned accesses in a manner that is appropriate for the 
particular system. 

The detection of unaligned accesses is activated by a 1
in the Trap Unaligned Access (TU) bit of the Current
Processor Status Register. Unaligned access detection
is based on the data length as indicated by the OPT field
of a load or store instruction, and on the two
least-significant bits of the specified address. Only addresses for
instruction/data memory accesses are checked;
alignment is ignored for input/output accesses and
coprocessor transfers.

Word 00000000
Half-Word 00000000             Half-Word 00000002
Byte 00000000  Byte 00000001   Byte 00000002  Byte 00000003

Word 00000004
Half-Word 00000004             Half-Word 00000006
Byte 00000004  Byte 00000005   Byte 00000006  Byte 00000007

...

Word FFFFFFF8
Half-Word FFFFFFF8             Half-Word FFFFFFFA
Byte FFFFFFF8  Byte FFFFFFF9   Byte FFFFFFFA  Byte FFFFFFFB

Word FFFFFFFC
Half-Word FFFFFFFC             Half-Word FFFFFFFE
Byte FFFFFFFC  Byte FFFFFFFD   Byte FFFFFFFE  Byte FFFFFFFF

Figure 51. Byte and Half-Word Addressing with BO = 0

Word 00000000
Half-Word 00000002             Half-Word 00000000
Byte 00000003  Byte 00000002   Byte 00000001  Byte 00000000

Word 00000004
Half-Word 00000006             Half-Word 00000004
Byte 00000007  Byte 00000006   Byte 00000005  Byte 00000004

...

Word FFFFFFF8
Half-Word FFFFFFFA             Half-Word FFFFFFF8
Byte FFFFFFFB  Byte FFFFFFFA   Byte FFFFFFF9  Byte FFFFFFF8

Word FFFFFFFC
Half-Word FFFFFFFE             Half-Word FFFFFFFC
Byte FFFFFFFF  Byte FFFFFFFE   Byte FFFFFFFD  Byte FFFFFFFC

Figure 52. Byte and Half-Word Addressing with BO = 1

An Unaligned Access trap occurs only if the TU bit is 1 
and any of the following combinations of OPT field and 
address bits is detected for a load or store to instruction/ 
data memory: 

OPT2  OPT1  OPT0   A1  A0

 0     0     0     0   1     Unaligned
 0     0     0     1   0     word access
 0     0     0     1   1

 0     1     0     0   1     Unaligned
 0     1     0     1   1     half-word access


The trap handler for the Unaligned Access trap is 
responsible for generating the correct sequence of 
aligned accesses and performing any necessary
shifting, masking and/or merging. Note that a virtual
page-boundary crossing also may have to be considered.
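
The detection rule for the Unaligned Access trap can be written as a small predicate. This sketch assumes the OPT encodings used elsewhere in this section (000 for a word access, 010 for a half-word access); the function and parameter names are hypothetical.

```python
OPT_WORD = 0b000
OPT_HALF_WORD = 0b010


def unaligned_access_trap(tu: int, opt: int, address: int,
                          memory_access: bool = True) -> bool:
    """Return True if a load or store would cause an Unaligned Access trap."""
    if not tu or not memory_access:
        return False        # detection off, or an I/O or coprocessor access
    a1a0 = address & 0b11
    if opt == OPT_WORD:
        return a1a0 != 0    # a word must start on a multiple of 4
    if opt == OPT_HALF_WORD:
        return (a1a0 & 1) == 1   # a half-word must start on a multiple of 2
    return False            # other OPT values are never reported as unaligned
```

A trap handler built on this rule would typically split the offending access into aligned accesses and merge the results.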

Alignment of Instructions

In the Am29000, all instructions are 32 bits in length, and
are aligned on word-address boundaries. The
processor's Program Counter is 30 bits in length, and the
least-significant 2 bits of processor-generated instruction
addresses are always 00. An unaligned address can be
generated by indirect jumps and calls. However,
alignment is ignored by the processor in this case, and it
expects the system to force alignment (i.e., by interpreting
the two least-significant address bits as 00, regardless
of their values).

Accessing Instructions as Data 

To aid the external access of instructions and data on 
separate buses, the processor distinguishes between 
instruction and data accesses. However, it does not 
support a logical distinction between instruction and
data address spaces (except in the case of instruction
read-only memory). In particular, address translation in
the Memory Management Unit is in no way affected by 
this distinction (although memory protection is). 

In systems where it is necessary to access instructions
as data, this function should be performed via the
shared address space. The OPT field provides a means
for loads to access instructions in the instruction
read-only memory (ROM) address space. The Am29000
does not take any action to prevent a store to the
instruction ROM address space.

Byte and Half-Word Accesses

The Am29000 can perform byte and half-word accesses 
in either software or hardware under control of the Data 
Width Enable (DW) bit of the Configuration Register. 
Software byte and half-word accesses are selected by a 
DW bit of 0, and hardware byte and half-word accesses 
are selected by a DW bit of 1. Software byte and
half-word accesses are less efficient than hardware byte and
half-word accesses, but hardware accesses require that
the system be able to selectively write individual byte 
and half-word positions within external devices and 
memories. The software-only technique is compatible 
with systems designed to provide hardware support for 
byte and half-word accesses. 

This section describes the operation of both software 
and hardware byte and half-word accesses. Byte and 
half-word accesses operate as described here for mem- 
ory and input/output accesses, but not for coprocessor 
transfers. Coprocessor transfers are unaffected by the 
DW bit. 

The DW bit is cleared by a processor reset. It must ex- 
plicitly be set to 1 by software before hardware byte and 
half-word accesses can be performed. 

Software Byte and Half-Word Accesses

If the DW bit is 0, the Am29000 allows the Byte Pointer 
Register to be set with the least-significant bits of an ad- 
dress specified by any load or store instruction, except 
those that transfer information to and from the coproces- 
sor. Insert and extract instructions can then be used to 
access the byte or half-word of interest, after the
external word has been accessed. This provides a
general-purpose mechanism for manipulating external byte and
half-word data, without the need for external hardware 
support. 

To load a byte or half-word, a word load is first per- 
formed. This load sets the BP field with the two least- 
significant bits of the address. A subsequent EXBYTE, 
EXHW, or EXHWS instruction extracts the byte or
half-word of interest from the accessed word.

To store a byte or half-word, a load is first performed,
setting the BP field with the two least-significant bits of 
the address. A subsequent INBYTE or INHW instruction 
inserts the byte or half-word of interest into the accessed 
word, and the resulting word is then stored. 
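
The software byte read and write sequences just described can be modeled as follows. This is a simplified sketch: memory is a dictionary of aligned words, the `exbyte` and `inbyte` helpers only approximate the corresponding instructions, and all function names are hypothetical.

```python
def _shift(bp: int, bo: int) -> int:
    """Bit offset of the byte selected by BP under byte order BO."""
    return 8 * bp if bo == 1 else 8 * (3 - bp)


def exbyte(word: int, bp: int, bo: int) -> int:
    """Extract the byte selected by BP, right-justified and zero-extended."""
    return (word >> _shift(bp, bo)) & 0xFF


def inbyte(word: int, byte: int, bp: int, bo: int) -> int:
    """Return `word` with the byte selected by BP replaced by `byte`."""
    s = _shift(bp, bo)
    return (word & ~(0xFF << s) & 0xFFFFFFFF) | ((byte & 0xFF) << s)


def software_byte_read(memory: dict, address: int, bo: int = 0) -> int:
    word = memory[address & ~3]     # word load
    bp = address & 3                # the load sets BP from the address
    return exbyte(word, bp, bo)


def software_byte_write(memory: dict, address: int, value: int, bo: int = 0) -> None:
    word = memory[address & ~3]     # word load, sets BP
    bp = address & 3
    memory[address & ~3] = inbyte(word, value, bp, bo)   # insert, then word store
```

Note that the write is a read-modify-write of the whole word, which is why no byte strobes are needed in the external memory.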

Software that relies on loads and stores setting the BP 
field cannot operate correctly when the Freeze (FZ) bit 
of the Current Processor Status Register is 1, because
the ALU Status Register is frozen. 

Hardware Byte and Half-Word Accesses

If the DW bit is 1 on a load, the Am29000 selects a byte 
or half-word from the loaded word depending on the Op- 
tion (OPT) bits of the load instruction, the Byte Order 
(BO) bit of the Configuration Register, and the two least- 
significant bits of the address (for bytes) or the next-to- 
least-significant bit of the address (for half-words). The 
selected byte or half-word is right-justified within the
destination register. If the SB bit of the load instruction is
0, the remainder of the destination register is
zero-extended. If the SB bit is 1, the remainder of the
destination register is sign-extended with the sign bit of the
selected byte or half-word.

If the DW bit is 1 on a store, the Am29000 replicates the
low-order byte or half-word in the source register into
every byte and half-word position of the stored word.
The system is responsible for generating the appropriate
byte and/or half-word strobes, based on the
OPT2-OPT0 signals and the two least-significant bits of the
address, to write the appropriate byte or half-word in the
selected device or memory (the system byte order must
also be considered). The SB bit does not affect the
operation of a store, except for setting the BP field as
described below.

If the SB bit is 1 for either a load or store and the DW bit is
also 1, both bits of the BP field are set to the complement
of the BO bit when the load or store is executed. This
does not directly affect the load or store access, but
supports compatibility for software developed for
word-write-only systems. Hardware byte and half-word
accesses, in contrast to software byte and half-word
accesses, can be performed when the FZ bit is 1,
because these accesses do not rely on the BP field.
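
The hardware load and store data paths described above (with DW = 1) can be sketched as two functions. The names are hypothetical, `bo` and `sb` are given as 0 or 1, and the OPT encodings 001 (byte) and 010 (half-word) are those used in this section.

```python
def hardware_load(word: int, address: int, opt: int, sb: int, bo: int) -> int:
    """Value placed in the destination register for a load with DW = 1."""
    if opt == 0b001:                                   # byte access
        sel = address & 3
        shift = 8 * sel if bo == 1 else 8 * (3 - sel)
        value, bits = (word >> shift) & 0xFF, 8
    elif opt == 0b010:                                 # half-word access
        high_order = ((address >> 1) & 1) == bo
        value, bits = (word >> (16 if high_order else 0)) & 0xFFFF, 16
    else:
        return word & 0xFFFFFFFF                       # word access: unchanged
    if sb and value & (1 << (bits - 1)):
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)       # sign-extend
    return value


def hardware_store_data(register: int, opt: int) -> int:
    """Word driven on the data bus for a store with DW = 1."""
    if opt == 0b001:
        return (register & 0xFF) * 0x01010101          # byte in every byte position
    if opt == 0b010:
        return (register & 0xFFFF) * 0x00010001        # half-word in both positions
    return register & 0xFFFFFFFF
```

Because the store data is replicated, the external system only has to assert the correct byte or half-word strobe; it never has to shift the data.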

System Alternatives and Compatibility 

The two mechanisms for performing byte and half-word 
accesses create the possibility of two types of systems. 
These are named for convenience: 

■ Type 1: simple, word-only accesses in external
devices and memories; software byte and
half-word accesses.

■ Type 2: byte/half-word strobes in external
devices and memories; hardware byte and
half-word accesses by the Am29000.

The provision for hardware byte and half-word accesses 
encourages Type 2 systems. Software for Type 1 sys- 
tems can execute on Type 2 systems, but the reverse is 
not true. Software compatibility is possible primarily be- 
cause of the DW bit and because the Am29000 sets the 
BP field with an appropriate byte pointer even when it 
performs byte and half-word accesses with internal 
hardware. Also, the system must return a full word in
either type of system, regardless of the access
data-width. The DW bit must be 0 in Type 1 systems and must
be 1 in Type 2 systems. To illustrate compatibility
between systems, consider the following steps of an
unsigned byte load compiled for a Type 1 system, but
executing on a Type 2 system:

1. Perform a load with OPT = 001 and SB = 1.

   ■ Type 1 system: The addressed word is
   accessed and placed into the destination register.
   The BP field is set with the two
   least-significant bits of the address.

   ■ Type 2 system: The addressed byte is
   accessed, aligned, padded, and placed into the
   destination register. The BP field is set to point
   to the low-order byte, reflecting the alignment
   that has been performed (the pointer depends
   on the value of the BO bit).

2. Perform a byte extract on the loaded word.

   ■ Type 1 system: The byte selected by the BP
   field is aligned to the low-order byte of the
   destination register and the remainder of the word
   is zero-extended. The selected byte may be in
   any byte position.

   ■ Type 2 system: The byte selected by the BP
   field (set to point to the low-order byte) is
   aligned to the low-order byte of the destination
   register and the remainder of the word is
   zero-extended. (Note that the selected byte was
   already in the low-order byte position. This
   operation does not change the program state
   but merely allows software compatibility.)
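
The two steps above can be checked with a small model that runs the same Type 1 sequence under both system types. It is a sketch only: the function names are hypothetical and `exbyte` approximates the byte-extract instruction.

```python
def _shift(bp: int, bo: int) -> int:
    """Bit offset of the byte selected by BP under byte order BO."""
    return 8 * bp if bo == 1 else 8 * (3 - bp)


def load_type1(word: int, address: int, bo: int) -> tuple:
    """DW = 0: the whole word is loaded and BP takes the two address bits."""
    return word, address & 3


def load_type2(word: int, address: int, bo: int) -> tuple:
    """DW = 1, OPT = 001, SB = 1: byte selected, right-justified, sign-extended."""
    byte = (word >> _shift(address & 3, bo)) & 0xFF
    value = (byte | 0xFFFFFF00) if (byte & 0x80) else byte
    bp = 0b00 if bo == 1 else 0b11      # both BP bits = complement of BO
    return value, bp


def exbyte(word: int, bp: int, bo: int) -> int:
    """Extract the byte selected by BP, zero-extended."""
    return (word >> _shift(bp, bo)) & 0xFF
```

On a Type 2 system the extract step reads back the low-order byte that the hardware has already aligned, so both systems end with the same zero-extended byte.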

The recommended instruction sequences for all types of 
byte and half-word accesses and for both types of sys- 
tems are enumerated below. Compatibility between 
these systems follows the above example, but for brev- 
ity, compatibility is not described in detail here. 

Byte read, unsigned:

Type 1                        Comments

load 0,17,temp,addr           ; OPT = 001, SB = 1
exbyte temp,temp,0            ; get byte

Type 2                        Comments

load 0,1,temp,addr            ; OPT = 001, SB = 0

Byte read, signed:

Type 1                        Comments

load 0,17,temp,addr           ; OPT = 001, SB = 1
exbyte temp,temp,0            ; get byte
sll temp,temp,24              ; sign extend
sra temp,temp,24

Type 2                        Comments

load 0,17,temp,addr           ; OPT = 001, SB = 1
                              ; (sign extended)

Byte write:

Type 1                        Comments

load 0,17,temp,addr           ; OPT = 001, SB = 1
inbyte temp,temp,data         ; insert byte
store 0,1,temp,addr           ; store

Type 2                        Comments

store 0,1,data,addr           ; OPT = 001, SB = 0

Half-word read, unsigned:

Type 1                        Comments

load 0,18,temp,addr           ; OPT = 010, SB = 1
exhw temp,temp,0              ; get half-word unsigned

Type 2                        Comments

load 0,2,temp,addr            ; OPT = 010, SB = 0

Half-word read, signed:

Type 1                        Comments

load 0,18,temp,addr           ; OPT = 010, SB = 1
exhws temp,temp               ; get half-word sign-extend

Type 2                        Comments

load 0,18,temp,addr           ; OPT = 010, SB = 1
                              ; (sign-extend)

Half-word write:

Type 1                        Comments

load 0,18,temp,addr           ; OPT = 010, SB = 1
inhw temp,temp,data           ; insert half-word
store 0,2,temp,addr           ; store

Type 2                        Comments

store 0,2,data,addr           ; OPT = 010, SB = 0
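
The numeric control values in these sequences (1, 2, 17, and 18) follow from the bit positions of the SB and OPT fields. The sketch below assumes that the second operand of the load and store mnemonics corresponds to instruction bits 22-16, so that bit 16 of the instruction is bit 0 of the operand; this mapping is inferred from the comments in the sequences, and the function name is hypothetical.

```python
def control_field(opt: int, sb: int = 0, ua: int = 0, pa: int = 0) -> int:
    """Compose the control operand from the PA, SB, UA, and OPT fields."""
    return ((pa & 1) << 5) | ((sb & 1) << 4) | ((ua & 1) << 3) | (opt & 0b111)


byte_with_sb = control_field(opt=0b001, sb=1)        # used as 17 in the sequences
half_word_with_sb = control_field(opt=0b010, sb=1)   # used as 18 in the sequences
```

The Type 2 stores use the values 1 and 2, which are the same OPT encodings with SB = 0.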
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INTERRUPTS AND TRAPS 

Interrupts and traps cause the Am29000 to suspend the 
execution of an instruction sequence and to begin the 
execution of a new sequence. The processor may or 
may not later resume the execution of the original in- 
struction sequence. 

The distinction between interrupts and traps is largely 
one of causation and enabling. Interrupts allow external
devices and the Timer Facility to control processor exe- 
cution, and are always asynchronous to program execu- 
tion. Traps are intended to be used for certain excep- 
tional events that occur during instruction execution, 
and are generally synchronous to program execution. 

Throughout this manual, a distinction is made between 
the point at which an interrupt or trap occurs and the 
point at which it is taken. An interrupt or trap is said to 
occur when all conditions that define the interrupt or trap 
are met. However, an interrupt or trap that occurs is not 
necessarily recognized by the processor, either be- 
cause of various enables or because of the processor's 
operational mode (e.g., Halt mode). An interrupt or trap
is taken when the processor recognizes the interrupt or 
trap and alters its behavior accordingly. 

Interrupts 

Interrupts are caused by signals applied to any of the
external inputs INTR3-INTR0, or by the Timer Facility. The
processor may be disabled from taking certain
interrupts by the masking capability provided by the Disable
All Interrupts and Traps (DA) bit, Disable Interrupts (DI)
bit, and Interrupt Mask (IM) field in the Current
Processor Status Register.

The DA bit disables all interrupts and most traps. The DI
bit disables external interrupts without affecting the
recognition of traps and Timer interrupts. The 2-bit IM field
selectively enables external interrupts as follows:



IM Value    Result

00          INTR0 enabled
01          INTR1-INTR0 enabled
10          INTR2-INTR0 enabled
11          INTR3-INTR0 enabled



Note that the INTR0 interrupt cannot be disabled by the
IM field. Also, note that no external interrupt is taken if
either the DA or DI bit is 1. The Interrupt Pending bit in
the Current Processor Status indicates that one or more
of the signals INTR3-INTR0 is active, but that the
corresponding interrupt is disabled due to the value of either
DA, DI, or IM.
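
The masking rules for external interrupts can be summarized in a short model. The function names are hypothetical, interrupt lines are represented by their numbers 0 to 3, and Timer interrupts and traps are outside the scope of the sketch.

```python
def enabled_external_interrupts(im: int, da: int, di: int) -> list:
    """Return the INTR line numbers that can currently be taken."""
    if da or di:
        return []                       # no external interrupt is taken
    return list(range((im & 0b11) + 1))  # IM = n enables INTR0 through INTRn


def interrupt_pending(active_lines: set, im: int, da: int, di: int) -> bool:
    """Model of the Interrupt Pending bit: an active line that is disabled."""
    enabled = set(enabled_external_interrupts(im, da, di))
    return any(line not in enabled for line in active_lines)
```

With IM = 00 only INTR0 is enabled, which reflects the statement that INTR0 cannot be disabled by the IM field alone.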

Traps 

Traps are caused by signals applied to one of the inputs
TRAP1-TRAP0, or by exceptional conditions such as
protection violations. Except for the Instruction Access
Exception, Data Access Exception, and Coprocessor
Exception traps, traps are disabled by the DA bit in the
Current Processor Status; a 1 in the DA bit disables
traps, and a 0 enables traps. It is not possible to
selectively disable individual traps.

Wait Mode 

A wait-for-interrupt capability is provided by the Wait 
mode. The processor is in the Wait mode whenever
the Wait Mode (WM) bit of the Current Processor Status
is 1. While in Wait mode, the processor neither fetches
nor executes instructions and performs no external 
accesses. The Wait mode is exited when an interrupt or 
trap is taken. 

Note that the processor can take only those interrupts or 
traps for which it is enabled, even in the Wait mode. For 
example, if the processor is in the Wait mode with a DA 
bit of 1, it can leave the Wait mode only via the Reset
mode or a WARN trap. 

Vector Area 

Interrupt and trap processing rely on the existence of a 
user-managed Vector Area in external instruction/data 
memory or instruction read-only memory (instruction 
ROM). The Vector Area begins at an address specified 
by the Vector Area Base Address Register, and pro- 
vides for as many as 256 different interrupt and trap han- 
dling routines. The processor reserves 24 routines for 
system operation and 40 routines for instruction emula- 
tion. The number and definition of the remaining 192 
possible routines are system-dependent. 

The Vector Area has one of two possible structures as 
determined by the Vector Fetch (VF) bit in the Configu- 
ration Register. The first structure, as described below, 
requires less external memory than the second, but 
imposes the performance penalty of the vector-table 
lookup. 

If the VF bit is 1, the structure of the Vector Area is a
table of vectors in instruction/data memory. The layout of
a single vector is shown in Figure 53. Each vector gives
the beginning word-address of the associated interrupt
or trap handling routine, and specifies, by the R bit,
whether the routine is contained in instruction/data
memory (R = 0) or instruction ROM (R = 1).

If the VF bit is 0, the structure of the Vector Area is a seg- 
ment of contiguous blocks of instructions in instruction/ 
data memory or instruction ROM. The ROM Vector Area 
(RV) bit of the Configuration Register determines 
whether the Vector Area is in instruction/data memory 
(RV = 0) or instruction ROM (RV = 1). A 64-instruction
block contains exactly one interrupt or trap handling rou- 
tine, and blocks are aligned on 64-instruction address 
boundaries. 

Vector Numbers 

When an interrupt or trap is taken, the processor
determines an 8-bit vector number associated with the
interrupt or trap. The vector number gives either the
number of a vector table entry or the number of an
instruction block, depending on the value of the VF bit.

Figure 53. Vector Table Entry

If the VF bit is 1, the physical address of the vector table
entry is generated by replacing bits 9-2 of the value in
the Vector Area Base Address Register with the vector
number.

If the VF bit is 0, the physical address of the first
instruction of the handling routine is generated by
replacing bits 15-8 of the value in the Vector Area Base
Address Register with the vector number.
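
As a rough illustration of the two address calculations, the following Python sketch inserts the 8-bit vector number into a 32-bit base value. The function name is hypothetical. The block case assumes that the vector number occupies bits 15-8, which follows from blocks of 64 four-byte instructions (256 bytes) aligned on block boundaries.

```python
def vector_address(vab, vector_number, vf):
    """Return the physical address used for a given vector number.

    vab           -- value of the Vector Area Base Address Register
    vector_number -- 8-bit vector number (0 to 255)
    vf            -- Vector Fetch bit of the Configuration Register
    """
    if not 0 <= vector_number <= 255:
        raise ValueError("vector number must fit in 8 bits")
    # VF = 1: table of one-word vectors, number goes into bits 9-2.
    # VF = 0: 64-instruction blocks, number goes into bits 15-8 (assumed).
    shift = 2 if vf else 8
    field_mask = 0xFF << shift
    return ((vab & ~field_mask) | (vector_number << shift)) & 0xFFFFFFFF
```

With a base of 0x00010000, vector number 16 (INTR0) would give the table entry address 0x00010040 when VF is 1, and the handler address 0x00011000 when VF is 0.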

Vector numbers are either predefined or specified by an 
instruction causing the trap. The assignment of vector 
numbers is shown in Figure 54 (vector numbers are in 
decimal notation). Vector numbers 64 to 255 are for use 
by trapping instructions; the definition of the routines
associated with these numbers is system-dependent.

Interrupt and Trap Handling 

Interrupt and trap handling consists of two distinct
operations: taking the interrupt or trap, and returning from
the interrupt or trap handler. If the interrupt or trap
handler returns directly to the interrupted routine, the
interrupt or trap handler need not save and restore
processor state.

Taking an Interrupt or Trap

The following operations are performed in sequence by
the processor when an interrupt or trap is taken:

1. Instruction execution is suspended.

2. Instruction fetching is suspended.

3. Any in-progress load or store operation is
completed. Any additional operations are canceled
in the case of Load Multiple and Store Multiple.

4. The contents of the Current Processor Status
Register are copied into the Old Processor
Status Register.

5. The Current Processor Status Register is
modified as shown in Figure 55 (the value "u" means
unaffected). Note that setting the Freeze (FZ) bit
freezes the Channel Address, Channel Data,
Channel Control, Program Counter 0, Program
Counter 1, Program Counter 2, and ALU Status
Registers.

6. The address of the first instruction of the
interrupt or trap handler is determined. If the VF bit of
the Configuration Register is 1, the address is
obtained by accessing a vector from instruction/
data memory, using the physical address
obtained from the Vector Area Base Address
Register and the vector number. This access
appears on the channel as a data access, and the
OPT2-OPT0 signals indicate a word-length
access. If the VF bit is 0, the instruction address is
given directly by the Vector Area Base Address
Register and the vector number.

7. If the VF bit is 1, the R bit in the vector fetched in
Step 6 is copied into the RE bit of the Current
Processor Status Register. If the VF bit is 0, the
RV bit of the Configuration Register is copied
into the RE bit. This step determines whether or
not the first instruction of the interrupt handler is
in instruction ROM.

8. An instruction fetch is initiated using the
instruction address determined in Step 6. At this point,
normal instruction execution resumes.

Note that the processor does not explicitly save the
contents of any registers when an interrupt is taken. If
register saving is required, it is the responsibility of the
interrupt or trap-handling routine. For proper operation,
registers must be saved before any further interrupts or
traps may be taken. The FZ bit must be reset at least two
instructions before interrupts or traps are reenabled to
allow the program state to be reflected properly in
processor registers if an interrupt or trap is taken.

Returning from an Interrupt or Trap 

Two instructions are used to resume the execution of an
interrupted program: Interrupt Return (IRET), and
Interrupt Return and Invalidate (IRETINV). These
instructions are identical except in one respect: the IRETINV
instruction resets all Valid bits in the Branch Target
Cache, whereas the IRET instruction does not affect the
Valid bits.

In some situations, the processor state must be set
properly by software before the interrupt return is
executed. The following is a list of operations normally
performed in such cases:

1. The Current Processor Status is configured as
shown in Figure 56 (the value "x" is a "don't
care"). Note that setting the FZ bit freezes the
registers listed below so that they may be set for
the interrupt return.

Number   Type of Trap or Interrupt              Cause

0        Illegal Opcode                         executing undefined instruction
1        Unaligned Access                       access on unnatural boundary, TU = 1
2        Out of Range                           overflow or underflow
3        Coprocessor Not Present                coprocessor access, CP = 0
4        Coprocessor Exception                  coprocessor DERR response
5        Protection Violation                   invalid User-mode operation
6        Instruction Access Exception           IERR response
7        Data Access Exception                  DERR response, not coprocessor
8        User-Mode Instruction TLB Miss         no TLB entry for translation
9        User-Mode Data TLB Miss                no TLB entry for translation
10       Supervisor-Mode Instruction TLB Miss   no TLB entry for translation
11       Supervisor-Mode Data TLB Miss          no TLB entry for translation
12       Instruction TLB Protection Violation   TLB UE/SE = 0
13       Data TLB Protection Violation          TLB UR/SR = 0, UW/SW = 0 on write
14       Timer                                  Timer Facility
15       Trace                                  Trace Facility
16       INTR0                                  INTR0 input
17       INTR1                                  INTR1 input
18       INTR2                                  INTR2 input
19       INTR3                                  INTR3 input
20       TRAP0                                  TRAP0 input
21       TRAP1                                  TRAP1 input
22       Floating-Point Exception               unmasked floating-point exception
23       reserved
24-29    reserved for instruction emulation
         (op codes D8-DD)
30       MULTM                                  MULTM instruction
31       MULTMU                                 MULTMU instruction
32       MULTIPLY                               MULTIPLY instruction
33       DIVIDE                                 DIVIDE instruction
34       MULTIPLU                               MULTIPLU instruction
35       DIVIDU                                 DIVIDU instruction
36       CONVERT                                CONVERT instruction
37       SQRT                                   SQRT instruction
38       CLASS                                  CLASS instruction
39-41    reserved for instruction emulation
         (op codes E7-E9)
42       FEQ                                    FEQ instruction
43       DEQ                                    DEQ instruction
44       FGT                                    FGT instruction
45       DGT                                    DGT instruction
46       FGE                                    FGE instruction
47       DGE                                    DGE instruction
48       FADD                                   FADD instruction
49       DADD                                   DADD instruction
50       FSUB                                   FSUB instruction
51       DSUB                                   DSUB instruction
52       FMUL                                   FMUL instruction
53       DMUL                                   DMUL instruction
54       FDIV                                   FDIV instruction
55       DDIV                                   DDIV instruction
56       reserved for instruction emulation
         (op code F8)
57       FDMUL                                  FDMUL instruction
58-63    reserved for instruction emulation
         (op codes FA-FF)
64-255   Assert and EMULATE instruction traps
         (vector number specified by instruction)

Figure 54. Vector Number Assignments

Figure 55. Current Processor Status After an Interrupt or Trap



2. The Old Processor Status is set to the value of 
the Current Processor Status for the target 
routine. 

3. The Channel Address, Channel Data, and 
Channel Control registers are set to restart or re- 
sume uncompleted channel operations of the 
target routine. 

4. The Program Counter 1 and Program Counter 0
registers are set to the addresses of the first and
second instructions, respectively, to be executed
in the target routine.

5. Other registers are set as required. These may 
include registers such as the ALU Status, Q, and 
so forth, depending on the particular situation. 
Some of these registers are unaffected by the 
FZ bit, so they must be set in such a manner that 
they are not modified unintentionally before the 
interrupt return. 

Once the processor registers are configured properly,
as described above, an interrupt return instruction
(IRET or IRETINV) performs the remaining steps
necessary to return to the target routine. The following
operations are performed by the interrupt return instruction:

1. Any in-progress load or store operation is
completed. If a Load Multiple or Store Multiple
sequence is in progress, the interrupt return is not
executed until the sequence is completed.

2. Interrupts and traps are disabled, regardless of
the settings of the DA, DI, and IM fields of the
Current Processor Status, for Steps 3 through
10.

3. If the interrupt return instruction is an IRETINV, 
all Valid bits in the Branch Target Cache are 
reset. 

4. The contents of the Old Processor Status
Register are copied into the Current Processor Status
Register. This normally resets the FZ bit, allowing
the Program Counter 0, 1, 2, Channel Address,
Data, Control, and ALU Status registers
to update normally. Since certain bits of the
Current Processor Status Register always are
updated by the processor, this copy operation may
be irrelevant for certain bits (e.g., the Interrupt
Pending bit).

5. If the Contents Valid (CV) bit of the Channel
Control Register is 1, and the Not Needed (NN)
and Multiple Operation (ML) bits are both 0, an
external access is started. This operation is
based on the contents of the Channel Address,
Channel Data, and Channel Control registers.
The Current Processor Status Register
conditions the access, as is normally the case. Note
that Load Multiple and Store Multiple operations
are not restarted at this point.

6. The address in Program Counter 1 is used to
fetch an instruction. The Current Processor
Status Register conditions the fetch. This step is
treated as a branch in the sense that the
processor searches the Branch Target Cache for the
target of the fetch.

Figure 56. Current Processor Status Before Interrupt Return

7. The instruction fetched in Step 6 enters the de- 
code stage of the pipeline. 

8. The address in Program Counter 0 is used to
fetch an instruction. The Current Processor 
Status Register conditions the fetch. This step is 
treated as a branch in the sense that the proces- 
sor searches the Branch Target Cache for the 
target of the fetch. 

9. The instruction fetched in Step 6 enters the exe- 
cute stage of the pipeline, and the instruction 
fetched in Step 8 enters the decode stage. 

10. If the CV bit in the Channel Control Register is
1, the NN bit is 0, and the ML bit is 1, a Load
Multiple or Store Multiple sequence is started,
based on the contents of the Channel Address,
Channel Data, and Channel Control registers.

11. Interrupts and traps are enabled per the ap- 
propriate bits in the Current Processor Status 
Register. 

12. The processor resumes normal operation. 
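
Steps 5 and 10 of the sequence above both depend on the same three bits of the Channel Control Register. The following Python sketch summarizes that decision. It is an illustrative model only, and the function name and returned strings are hypothetical.

```python
def channel_restart_action(cv, nn, ml):
    """Classify what the interrupt return does with the channel registers.

    cv -- Contents Valid bit, nn -- Not Needed bit,
    ml -- Multiple Operation bit (each 0 or 1).
    """
    if cv and not nn:
        if ml:
            # Step 10: started after both target instructions are fetched.
            return "start Load Multiple or Store Multiple sequence (Step 10)"
        # Step 5: started before the first target instruction is fetched.
        return "start single external access (Step 5)"
    # The steps above describe no restart for any other combination.
    return "no channel operation restarted"
```

Under this reading, CV = 1, NN = 0, ML = 0 restarts a single access early in the return sequence, whereas the same bits with ML = 1 defer the restart until Step 10.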

Fast Interrupt Processing 

The registers affected by the FZ bit of the Current
Processor Status Register are those that are modified by
almost any usual sequence of instructions. Since the FZ
bit is set by an interrupt or trap, the interrupt or trap
handler is able to execute while not disturbing the state of
the interrupted routine, though its execution is somewhat
restricted. Thus, it is not necessary in many cases
for the interrupt or trap handler to save the registers that
are affected by the FZ bit.

The processor provides an additional benefit if the
Program Counter 0 and Program Counter 1 registers are
not modified by the interrupt or trap handler. If Program
Counters 0 and 1 contain the addresses of sequential
instructions when an interrupt or trap is taken, and if they
are not modified before an interrupt return is executed,
Step 8 of the interrupt return sequence above occurs as
a sequential fetch, instead of a branch, for the
interrupt return. The performance impact of a sequential
fetch is normally less than that of a nonsequential fetch.

Because the registers affected by the FZ bit are
sometimes required for instruction execution, it is not possible
for the interrupt or trap handler to execute all
instructions unless the required registers are first saved
elsewhere (e.g., in one or more global registers). Most of the
restrictions due to register dependencies are obvious
(e.g., the Byte Pointer for byte extracts), and will not be
discussed here. Other less obvious restrictions are
listed below:

1. Load Multiple and Store Multiple. The Channel
Address, Channel Data, and Channel Control
registers are used to sequence Load Multiple
and Store Multiple operations, so these
instructions cannot be executed while the registers are
frozen. However, note that other external
accesses may occur; the Channel Address,
Channel Data, and Channel Control registers
are required only to restart an access after an
exception, and the interrupt or trap handler is not
expected to encounter any exceptions.

2. Loads and stores that set the Byte Pointer. If the
Set Byte Pointer (SB) bit of a load or store
instruction is 1 and the FZ bit is also 1, there is no effect
on the Byte Pointer. Thus, the execution of
external byte and half-word accesses using this
mechanism is not possible.

3. Extended arithmetic. The Carry bit of the ALU
Status Register is not updated while the FZ bit
is 1.

4. Divide step instructions. The Divide Flag of the
ALU Status Register is not updated when the FZ
bit is 1.

If the interrupt or trap handler does not save the state of
the interrupted routine, it cannot allow additional
interrupts and traps. Also, the operation of the interrupt or
trap handler cannot depend on any trapping
instructions (e.g., Floating-Point instructions, illegal operation
codes, arithmetic overflow, etc.) since these are
disabled. There are certain cases, however, where traps
are unavoidable; these are discussed in the Arithmetic
Exceptions section.



WARN Trap

The processor recognizes a special trap, caused by the
activation of the WARN input, that cannot be masked.
The WARN trap is intended to be used for severe
system-error or deadlock conditions. It allows the processor
to be placed in a known, operable state, while
preserving much of its original state for error reporting and
possible recovery. Therefore, it shares some features in
common with the Reset mode as well as features
common to other traps described in this section.

The major differences between the WARN trap and
other traps are:

1. The processor does not wait for an in-progress
external access to be completed before taking
the trap, since this access might not be
completed. However, the information related to any
outstanding access is retained by the Channel
Address, Channel Data, and Channel Control
registers when the trap is taken.

2. The vector-fetch operation is not performed,
regardless of the VF bit of the Configuration
Register, when the WARN trap is taken. Instead, the
ROM Enable (RE) bit in the Current Processor
Status is set, and instruction fetching begins
immediately at Address 16 in the instruction ROM.
The trap handler executes directly from the
instruction ROM without the need to access
external (and possibly nonfunctional or invalid)
instruction/data memory.



Note that the WARN trap may disrupt the state of the routine
that is executing when it is taken, prohibiting this routine
from being restarted.

Sequencing of Interrupts and Traps

On every cycle, the processor decides either to execute
instructions or to take an interrupt or trap. Since there
are multiple sources of interrupts and traps, more than
one interrupt or trap may be pending on a given cycle.

To resolve conflicts, interrupts and traps are taken ac- 
cording to the priority shown in Figure 57. In this table, 
interrupts and traps are listed in order of decreasing pri- 
ority. This section discusses the first three columns of 
Figure 57. The last two columns are discussed in the 
Exception Reporting and Restarting section. 

In Figure 57, interrupts and traps fall into one of two
categories depending on the timing of their occurrence
relative to instruction execution. These categories are
indicated in the third column by the labels "inst" and
"async." These labels have the following meanings:

1. Inst: Generated by the execution or attempted
execution of an instruction.

2. Async: Generated asynchronous to and
independent of the instruction being executed,
although it may be a result of an instruction
executed previously.

The principle for interrupt and trap sequencing is that the
highest priority interrupt or trap is taken first. Other
interrupts and traps remain active until they can be
taken, or are regenerated when they can be taken. This
is accomplished, depending on the type of interrupt or
trap, as follows:

1. All traps in Figure 57 with Priority 13 or 14 are
regenerated by the re-execution of the causing
instruction.

2. Most of the interrupts and traps of Priorities 4
through 12 must be held by external hardware
until they are taken. The exceptions to this are
listed in (3) below.

3. The exceptions to (2) above are the Data Access
Exception trap, the Coprocessor Exception trap,
the Timer interrupt, and the Trace trap. These
are caused by bits in various registers in the
processor and are held by these registers until
taken or cleared. The relevant bits are: the
Transaction Faulted (TF) bit of the Channel
Control Register for Data Access Exception and
Coprocessor Exception traps, the Interrupt (IN)
bit of the Timer Reload Register for Timer
interrupts, and the Trace Pending (TP) bit of the
Current Processor Status Register for Trace traps.

4. All traps of Priorities 2 and 3 in Figure 57, except
for the Unaligned Access trap, are not
regenerated. These traps are mutually exclusive and are
given high priority because they cannot be
regenerated; they must be taken if they occur. If
one of these traps occurs at the same time as a
reset or WARN trap, it is not taken, and its
occurrence is lost.

5. The Unaligned Access trap is regenerated
internally when an external access is restarted by the
Channel Address, Channel Data, and Channel
Control registers. Note that this trap is not
necessarily exclusive to the traps discussed in (4)
above.

Note that the Channel Address, Channel Data, and
Channel Control registers are set for a WARN trap only if
an external access is in progress when the trap is taken.
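
The "highest priority first" principle can be illustrated with a short Python sketch. It models only the asynchronous sources, using the priority numbers listed for them in Figure 57. The dictionary and function names are hypothetical, and the sketch ignores masking by DA, DI, and IM as well as the question of which hardware holds each request.

```python
# Priorities of the asynchronous interrupts and traps as listed in
# Figure 57 (1 is highest). Instruction-generated traps are omitted.
ASYNC_PRIORITY = {
    "WARN": 1,
    "Data Access Exception": 4,
    "Coprocessor Exception": 4,
    "TRAP0": 5,
    "TRAP1": 6,
    "INTR0": 7,
    "INTR1": 8,
    "INTR2": 9,
    "INTR3": 10,
    "Timer": 11,
    "Trace": 12,
}


def next_to_take(pending):
    """Return the pending source with the highest priority, or None."""
    candidates = list(pending)
    if not candidates:
        return None
    # A lower number means a higher priority. The name is used as a
    # tie-breaker only to keep this sketch deterministic.
    return min(candidates, key=lambda name: (ASYNC_PRIORITY[name], name))


def service_order(pending):
    """Order in which simultaneously pending sources would be taken,
    assuming each remains pending until it is taken."""
    remaining = set(pending)
    order = []
    while remaining:
        taken = next_to_take(remaining)
        order.append(taken)
        remaining.discard(taken)
    return order
```

For instance, if Timer, INTR2, and TRAP1 are all pending on the same cycle, this model takes TRAP1 first, then INTR2, then Timer.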

Exception Reporting and Restarting 

When an instruction encounters an exceptional
condition, the Program Counter 0, Program Counter 1, and
Program Counter 2 registers report the relevant
instruction address(es), and allow the instruction sequence to
be restarted once the exceptional condition has been
remedied (if possible). Similarly, when an external
access or coprocessor transfer encounters an exceptional
condition, the Channel Address, Channel Data, and
Channel Control registers report information on the
access or transfer, and allow it to be restarted. This section
describes the interpretation and use of these registers.

The "PC1" column in Figure 57 describes the value held
in the Program Counter 1 Register (PC1) when the
interrupt or trap is taken. For traps in the "inst" category, PC1
contains either the address of the instruction causing
the trap, indicated by "curr," or the address of the
instruction following the instruction causing the trap,
indicated by "next."

For interrupts and traps in the "async" category, PC1
contains the address of the first instruction that was
not executed due to the taking of the interrupt or trap.
This is the next instruction to be executed upon interrupt
return, as indicated by "next" in the PC1 column.

Instruction Exceptions

For traps caused by the execution of an instruction (e.g.,
the Out of Range trap), the Program Counter 2 Register
contains the address of the instruction causing the trap.
In all of these cases, PC1 is in the "next" category. The
Exception Opcode Register contains the operation code
of the instruction causing the trap.

The traps associated with instruction fetches (i.e., those
of Priority 13) occur only if the processor attempts the
execution of the associated instruction. An exception
may be detected during an instruction prefetch, but the
associated trap does not occur if a nonsequential fetch
occurs before the processor attempts the execution of
the invalid instruction. This prevents the spurious
indication of instruction exceptions.

Priority      Type of Interrupt or Trap              Inst/Async  PC1   Channel Regs

1 (highest)   WARN                                   async       next  see Note

2             User-Mode Data TLB Miss                inst        next  all
              Supervisor-Mode Data TLB Miss          inst        next  all
              Data TLB Protection Violation          inst        next  all

3             Unaligned Access                       inst        next  all
              Coprocessor Not Present                inst        next  all
              Out of Range                           inst        next  N/A
              Floating-Point Exceptions              inst        next  N/A
              Assert Instructions                    inst        next  N/A
              Floating-Point Instructions            inst        next  N/A
              MULTIPLY                               inst        next  N/A
              MULTM                                  inst        next  N/A
              DIVIDE                                 inst        next  N/A
              MULTIPLU                               inst        next  N/A
              MULTMU                                 inst        next  N/A
              DIVIDU                                 inst        next  N/A
              EMULATE                                inst        next  N/A

4             Data Access Exception                  async       next  all
              Coprocessor Exception                  async       next  all

5             TRAP0                                  async       next  multiple
6             TRAP1                                  async       next  multiple
7             INTR0                                  async       next  multiple
8             INTR1                                  async       next  multiple
9             INTR2                                  async       next  multiple
10            INTR3                                  async       next  multiple
11            Timer                                  async       next  multiple
12            Trace                                  async       next  multiple

13            User-Mode Instruction TLB Miss         inst        curr  N/A
              Supervisor-Mode Instruction TLB Miss   inst        curr  N/A
              Instruction TLB Protection Violation   inst        curr  N/A
              Instruction Access Violation           inst        curr  N/A

14 (lowest)   Illegal Opcode                         inst        curr  N/A
              Protection Violation                   inst        curr  N/A

Note: The Channel Address, Channel Data, and Channel Control registers are set for a WARN trap
only if an external access is in progress when the trap is taken.

Figure 57. Interrupt and Trap Priority Table



Data Exceptions

The "Channel Regs" column of Figure 57 indicates the
cases for which the Channel Address, Channel Data,
and Channel Control registers contain information
related to an external access or coprocessor transfer
(these registers collectively are termed "channel
registers" in the following discussion). For the cases
indicated, the access or transfer was not completed
because of some exceptional condition. Note that the
Channel Data Register contains relevant information
only in the case of a store.

For the WARN trap, the channel registers are valid only if
a load or store was in progress when the trap was
taken. Recall that the WARN trap does not wait for any
in-progress access to be completed.

For the traps with an "all" in the "Channel Regs" column 
of Figure 57, the channel registers contain information 
relevant to the trap in all cases. These traps are associ- 
ated with exceptional events during external accesses 
or coprocessor transfers. 

For the traps with a "multiple" in the "Channel Regs" col- 
umn, the channel registers might contain information for 
restarting an interrupted Load Multiple or Store Multiple 
operation. In these cases, the operation did not encoun- 
ter an exception, but was simply canceled for latency 
considerations. 

The information contained in the channel registers al- 
lows the processor to restart the related operation dur- 
ing an interrupt return sequence, without any special as- 
sistance by software. Software must only ensure that 
the relevant information is retained in, or restored to, the 
channel registers before an interrupt return is executed. 

Arithmetic Exceptions 

Integer and floating-point instructions can cause Out of 
Range or Floating-Point Exception traps, respectively, if 
an exception is detected during the arithmetic operation. 
This section describes the conditions under which these 
traps occur and the additional operations performed be- 
yond those described in the Interrupt and Trap Handling 
section. 

Integer Exceptions

Some integer add and subtract instructions — ADDS, 
ADDU, ADDCS, ADDCU, SUBS, SUBU, SUBCS, 
SUBCU, SUBRS, SUBRU, SUBRCS, and SUBRCU— 
cause an Out of Range trap upon overflow or underflow 
of a 32-bit signed or unsigned result, depending on the 
instruction. 

Two integer multiply instructions, MULTIPLY and
MULTIPLU, cause an Out of Range trap upon overflow
of a 32-bit signed or unsigned result, respectively, if the
MO bit of the Integer Environment Register is 0. If the
MO bit is 1, these multiply instructions cannot cause an
Out of Range trap.

Two integer divide instructions, DIVIDE and DIVIDU,
take the Out of Range trap upon overflow of a 32-bit
signed or unsigned result, respectively, if the DO bit of
the Integer Environment Register is 0. If the DO bit is 1,
the divide instructions cannot cause an Out of Range
trap unless the divisor is 0. If the divisor is 0, an Out of
Range trap always occurs, regardless of the DO bit.
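
The multiply and divide rules above can be restated as a small Python sketch. It is a simplified illustration rather than a description of the hardware: the function names are hypothetical, operands are plain Python integers, and the divide quotient is assumed to be truncated toward zero before the range check.

```python
def fits_in_32_bits(value, signed):
    """True if value is representable as a 32-bit signed or unsigned result."""
    if signed:
        return -(1 << 31) <= value <= (1 << 31) - 1
    return 0 <= value <= (1 << 32) - 1


def multiply_traps(a, b, signed, mo_bit):
    """MULTIPLY (signed=True) or MULTIPLU (signed=False): Out of Range
    trap on overflow, but only when the MO bit is 0."""
    return mo_bit == 0 and not fits_in_32_bits(a * b, signed)


def divide_traps(dividend, divisor, signed, do_bit):
    """DIVIDE (signed=True) or DIVIDU (signed=False): a zero divisor
    always traps; overflow traps only when the DO bit is 0."""
    if divisor == 0:
        return True
    quotient = abs(dividend) // abs(divisor)   # magnitude, truncated (assumed)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return do_bit == 0 and not fits_in_32_bits(quotient, signed)
```

Note the asymmetry this makes visible: setting MO to 1 suppresses every multiply overflow trap, while setting DO to 1 still leaves division by zero as a trapping case.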

In addition to the operations described in the Interrupt
and Trap Handling section, the following operations are
performed when an Out of Range trap is taken:

1. The operation code of the instruction causing the
exception is placed in the IOP field of the
Exception Opcode Register.

2. For the MULTIPLY, MULTIPLU, DIVIDE, and
DIVIDU instructions, the absolute register
numbers of the excepting instruction's source and
destination registers are placed into the Indirect
Pointer A, Indirect Pointer B, and Indirect Pointer
C registers.

3. For the MULTIPLY, MULTIPLU, DIVIDE, and
DIVIDU instructions, the destination register or
registers are unchanged.

Floating-Point Exceptions 

A Floating-Point Exception trap occurs when an excep- 
tion is detected during a floating-point operation, and the 
exception is not masked by the corresponding bit of the 
Floating-Point Mask Register. In this context, a floating- 
point operation is defined as any operation that accepts 
a floating-point number as a source operand, that pro- 
duces a floating-point result, or both. Thus, for example, 
the CONVERT instruction may create an exception 
while attempting to convert a floating-point value to an 
integer value. 

In addition to the operations described in the Interrupt 
and Trap Handling section, the following operations are 
performed when a Floating-Point Exception trap is 
taken: 

1. The operation code of the instruction causing the
exception is placed in the IOP field of the
Exception Opcode Register.

2. The status of the trapping operation is written
into the trap status bits of the Floating-Point
Status Register. The status bits that are written
do not depend on the values of the
corresponding mask bits in the Floating-Point Environment
Register.

3. The absolute register numbers of the excepting
instruction's source and destination registers
are placed into the Indirect Pointer A, Indirect
Pointer B, and Indirect Pointer C registers. If the
RB or RC fields specify a function code, that
code is transferred to the corresponding indirect
pointer. Note that if the most-significant bit of
this function code is 1, the value of the Stack
Pointer has been added to the RB field and must
be subtracted to recover the original field.

4. The destination register or registers are left
unchanged.

Exceptions During Interrupt 
and Trap Handling 

In most cases, interrupt and trap handling routines are
executed with the DA bit in the Current Processor Status
having a value of 1. It is assumed that these routines do
not create many of the exceptions possible in most other
processor routines, so most of these are ignored.

If the assumption of no exceptions is not valid for a
particular interrupt or trap handler, it is important that the
handler save the state of the processor and reset the FZ
bit of the Current Processor Status, so that the handler
itself may be restarted properly. This must be
accomplished before any interrupts or traps can be taken. In
this case, the state (or the state of some other process)
must be restored before an interrupt return is executed.



It is possible that errors reported via the IERR and DERR
signals are associated with hardware errors,
independent of any routine being executed. For this reason, the
Instruction Access Exception, Data Access Exception,
and Coprocessor Exception traps cannot be disabled by
the DA bit, and the processor may take one of these
traps even while handling another interrupt or trap.

If the processor does take an unmaskable trap while
handling another interrupt or trap, and the state of the
interrupt or trap handler is not reflected in processor
registers, it is not possible to return to the point at which the
unmaskable trap is taken. When the unmaskable trap is
taken, the processor state saved is that state associated
with the original interrupt or trap, not with the
unmaskable trap; however, the Old Processor Status Register is
modified to reflect the Current Processor Status
Register of the interrupt or trap handler. This situation,
indicated by the DA bit being 1 in the Old Processor Status
Register, may not be recoverable.

MEMORY MANAGEMENT

The Am29000 incorporates a Memory Management
Unit (MMU) for performing virtual-to-physical address
translation and memory access protection. This section
describes the logical operation of the Memory
Management Unit.

Address translation can be performed only for
instruction/data memory accesses. No address translation is
performed for instruction ROM, input/output,
coprocessor, or interrupt/trap vector accesses. However, an
instruction/data memory access can be redirected to
input/output by the address-translation process.

Translation Look-Aside Buffer 

The MMU stores the most recently performed address 
translations in a special cache, the Translation Look- 
Aside Buffer (TLB). All virtual addresses generated by 
the processor are translated by the TLB. Given a virtual 
address, the TLB determines the corresponding physi- 
cal address. 

The TLB reflects information in the processor system
page tables, except that it specifies the translation for
many fewer pages; this restriction allows the TLB to be
incorporated on the processor chip where the per-
formance of address translation is maximized.

A diagram of the TLB is shown in Figure 58. The TLB is a
table of 64 entries, divided into two equal sets, called Set 0
and Set 1. Within each set, entries are numbered 0 to
31. Entries in different sets that have equivalent entry
numbers are grouped into a unit called a line; there are
thus 32 lines in the TLB, numbered 0 to 31.

Each TLB entry is 64 bits long and contains mapping 
and protection information for a single virtual page. TLB 
entries may be inspected and modified by processor in- 
structions executed in the Supervisor mode. The layout 
of TLB entries is described in the Register Description 
section. 

The TLB stores information about the ownership of the
TLB entries in an 8-bit Task Identifier (TID) field in each
entry. This makes it possible for the TLB to be shared by
several independent processes without the need for in-
validation of the entire TLB as processes are activated.
It also increases system performance by permitting
processes to warm-start (i.e., to start execution on the
processor with a certain number of TLB entries remain-
ing in the TLB from a previous execution).

Figure 58. Translation Look-Aside Buffer Organization

Each TLB entry contains a Usage bit to assist manage- 
ment of the TLB entries. The Usage bit indicates which 
set of the entry within a given line was least recently 
used to perform an address translation. Usage bits for 
two entries in the same line are equivalent. 

The TLB contains other fields, described in the following
sections. 

Address Translation 

For the purpose of address translation, the virtual 
instruction/data address space of a process is parti- 
tioned into regions of fixed size, called pages. Pages are 
mapped by the address-translation process into equiva- 
lent-sized regions of physical memory, called page 
frames. All accesses to instructions or data contained 
within a given page use the same virtual-to-physical 
address translation. 

Virtual addresses are partitioned into three fields for the
address-translation process, as shown in Figure 59.
The partitioning of the virtual address is based on the
page size. Page sizes may be of 1, 2, 4, or 8 kb, as
specified by the MMU Configuration Register. The fields 
shown in Figure 59 are described in the following 
discussion. 




Address Translation Controls 

The processor attempts to perform address translation 
for the following external accesses: 

1. Instruction accesses, if the Physical Addressing/
Instructions (PI) and ROM Enable (RE) bits of
the Current Processor Status are both 0. 

2. User-mode accesses to instruction/data mem- 
ory if the Physical Addressing/Data (PD) bit of 
the Current Processor Status is 0. 

3. Supervisor-mode accesses to instruction/data 
memory if the Physical Address (PA) bit of the 
load or store instruction performing the access is 
0, and the PD bit of the Current Processor Status 
is 0. 

Address translation also is controlled by the MMU Con-
figuration Register. This register specifies the virtual
page size and contains an 8-bit Process Identifier (PID)
field. The PID field specifies the process number associ-
ated with the currently running program, if this is a User-
mode program. Supervisor-mode programs are as-
signed a fixed process number of 0. The process num-
ber is compared with Task Identifier (TID) fields of the
TLB entries during address translation. The TID field of
a TLB entry must match the process number for the
translation to be valid.



Figure 59. Virtual Address for 1-, 2-, 4-, and 8-kb Pages






Address Translation Process 

The address-translation process is diagrammed in 
Figure 60. Address translation is performed by the fol- 
lowing fields in the TLB entry: the Virtual Tag (VTAG), 
the Task Identifier (TID), the Valid Entry (VE) bit, the 
Real Page Number (RPN) field, and the Input/Output 
(IO) bit. To perform an address translation, the proces-
sor accesses the TLB line whose number is given by
certain bits in the virtual address. The bits used depend 
on the page size as follows: 





             Virtual Address Bits
Page Size    (for Line Access)

1 kb         14-10
2 kb         15-11
4 kb         16-12
8 kb         17-13
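
The line selection tabulated above can be sketched in a few lines of Python. This is an illustrative model only, not AMD code; the names PAGE_SHIFT and tlb_line are hypothetical. It relies on the observation that, for every page size in the table, the five line-select bits sit immediately above the page offset.

```python
# Illustrative model of TLB line selection (all names are hypothetical).
# Number of page-offset bits for each supported page size, in kb.
PAGE_SHIFT = {1: 10, 2: 11, 4: 12, 8: 13}

def tlb_line(vaddr: int, page_kb: int) -> int:
    """Return the TLB line number (0-31) selected by a 32-bit virtual address."""
    # The five bits just above the page offset select one of 32 lines,
    # e.g. bits 14-10 for 1-kb pages and bits 17-13 for 8-kb pages.
    return (vaddr >> PAGE_SHIFT[page_kb]) & 0x1F
```

With 1-kb pages, consecutive virtual pages fall on consecutive lines, so a contiguous 32-kb region can occupy every line of one set without conflict.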



The accessed line contains two TLB entries, which in
turn contain two VTAG fields. The VTAG fields are both
compared to bits in the virtual address. This comparison
depends on the page size as follows (note that VTAG
bit-numbers are relative to the VTAG field, not the TLB
entry):

Page Size    Virtual Address Bits    VTAG Bits

1 kb         31-15                   16-0
2 kb         31-16                   16-1
4 kb         31-17                   16-2
8 kb         31-18                   16-3



Certain bits of the VTAG field do not participate in the 
comparison for page sizes larger than 1 kb. These bits of
the VTAG field are required to be 0. 

For an address translation to be valid, the following con-
ditions must be met:

1. The virtual address bits match corresponding
bits of the VTAG field as specified above.

2. For a User-mode access, the TID field in the TLB
entry matches the PID field in the MMU Configu-
ration Register. For a Supervisor-mode access,
the TID field is 0.

3. The VE bit in the TLB entry is 1.

4. Only one entry in the line meets conditions 1, 2,
and 3 above. If this condition is not met, the re-
sults of the translation may be treated as valid by
the processor, but the results are unpredictable.

Figure 60. Address Translation Process
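
The four validity conditions can be modeled in software as follows. This is a minimal sketch for study purposes, not a description of the hardware implementation: the TlbEntry class, its field names, and the choice to raise an exception when both entries match (the hardware result is simply unpredictable) are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class TlbEntry:
    """Hypothetical software model of the translation fields of one TLB entry."""
    vtag: int   # 17-bit Virtual Tag
    tid: int    # 8-bit Task Identifier
    ve: bool    # Valid Entry bit
    rpn: int    # 22-bit Real Page Number

# Number of low-order VTAG bits that do not participate, per page size in kb.
VTAG_SKIP = {1: 0, 2: 1, 4: 2, 8: 3}

def entry_valid(entry, vaddr, page_kb, supervisor, pid):
    skip = VTAG_SKIP[page_kb]
    # Condition 1: virtual address bits 31-(15+skip) match VTAG bits 16-skip.
    tag_ok = (vaddr >> (15 + skip)) == (entry.vtag >> skip)
    # Condition 2: User mode compares TID with PID; Supervisor mode uses 0.
    process = 0 if supervisor else pid
    # Condition 3: the Valid Entry bit must be set.
    return tag_ok and entry.tid == process and entry.ve

def lookup(line, vaddr, page_kb, supervisor, pid):
    """Return the matching entry of a two-entry line, or None on a TLB miss."""
    hits = [e for e in line if entry_valid(e, vaddr, page_kb, supervisor, pid)]
    if len(hits) > 1:
        # Condition 4 violated; real hardware gives unpredictable results.
        raise ValueError("both entries in the line match")
    return hits[0] if hits else None
```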

If the address translation is valid for one TLB entry in the 
selected line, the RPN field in this entry is used to form 
the physical address of the access. The RPN field gives 
the portion of the physical address that depends on 
the translation; the remaining portion of the virtual ad- 
dress, called the Page Offset, is invariant with address 
translation. 

The Page Offset comprises the low-order bits of the vir- 
tual address, and gives the location of a byte (because 
of byte addressing) within the virtual page. This byte is 
located at the same position in the physical page frame, 
so the Page Offset also comprises the low-order bits of 
the physical address. 

The 32-bit physical address is the concatenation of cer- 
tain bits of the RPN field and Page Offset, where the bits 
from each depend on the page size as follows (note that 
RPN bit numbers are relative to the RPN field, not the 
TLB entry): 







                         Virtual Address Bits
Page Size    RPN Bits    for Page Offset

1 kb         21-0        9-0
2 kb         21-1        10-0
4 kb         21-2        11-0
8 kb         21-3        12-0



Note that certain bits of the RPN field are not used in 
forming the physical address for page sizes greater than 
1 kb. These bits of the RPN are required to be 0. In addi- 
tion, for certain instruction accesses, the Page Offset is 
incremented by 16. 
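
The concatenation of RPN bits and Page Offset can be illustrated with a short Python sketch. The function name and argument layout are hypothetical, and the sketch deliberately ignores the special case of instruction accesses whose Page Offset is incremented.

```python
# Illustrative model of physical-address formation (names are hypothetical).
RPN_SKIP = {1: 0, 2: 1, 4: 2, 8: 3}   # unused low-order RPN bits per page size

def physical_address(rpn: int, vaddr: int, page_kb: int) -> int:
    """Concatenate the participating RPN bits with the Page Offset."""
    skip = RPN_SKIP[page_kb]
    offset_bits = 10 + skip                       # 10 bits for 1 kb ... 13 for 8 kb
    frame = (rpn & 0x3FFFFF) >> skip              # RPN bits 21-skip
    offset = vaddr & ((1 << offset_bits) - 1)     # low-order virtual address bits
    return (frame << offset_bits) | offset
```

For example, with 1-kb pages an RPN of 1 and a virtual address of 0x12345 give the physical address 0x745: the frame contributes 0x400 and the 10-bit offset contributes 0x345.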

The address space of the physical address is deter-
mined by the Input/Output (IO) bit of the TLB entry. If the
IO bit is 0, the address is in the instruction/data memory
address space. If the IO bit is 1, the address is in the in-
put/output address space.

Successful and Unsuccessful Translations

If an address translation is successful, the TLB entry is 
further used to perform protection checking for the ac- 
cess. Bits in the TLB make it possible to restrict ac- 
cesses— independently for Supervisor-mode and User- 
mode accesses— to any combination of load, store, and 
instruction accesses, or to no access. 

If the address translation is valid and no protection viola- 
tion is detected, the physical address from the transla- 
tion is placed on the processor's address bus and the 
access is initiated. If the translation is not valid or a pro-
tection violation is detected, a trap occurs. Depending
on the state of the channel interface, the access request
may be placed on the address bus with the signal BINV
asserted, even though the trap occurs.

Also, if the address translation is successful and there is
no protection violation, the PGM bits from the TLB entry
used for translation are placed on the MPGM1-MPGM0
outputs during the address cycle for the access. If ad-
dress translation is not performed, these pins are both
Low for the address cycle.

If the TLB cannot translate an address, a TLB miss oc-
curs. The MMU causes a trap if either a TLB miss oc-
curs, or the translation is successful and a protection
violation is detected. The processor distinguishes be-
tween traps caused by instruction and data accesses,
and between traps caused by User and Supervisor-
mode accesses, as follows:



Trap Vector
Number         Type of Trap

8              User-Mode Instruction TLB Miss
9              User-Mode Data TLB Miss
10             Supervisor-Mode Instruction TLB Miss
11             Supervisor-Mode Data TLB Miss
12             Instruction TLB Protection Violation
13             Data TLB Protection Violation



The distinction between the above traps is made to 
assist trap handling, particularly the routines that load 
TLB entries. 
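
The vector assignment listed above reduces to a small decision function. The sketch below is illustrative; the function name and its boolean parameters are assumptions, while the vector numbers are those given in the table.

```python
def mmu_trap_vector(is_instruction: bool, supervisor: bool, miss: bool) -> int:
    """Return the trap vector number for an MMU trap.

    miss=True models a TLB miss; miss=False models a protection violation,
    which is not split by program mode.
    """
    if miss:
        # Vectors 8-11: User/Supervisor, then Instruction/Data.
        return 8 + (2 if supervisor else 0) + (0 if is_instruction else 1)
    # Vectors 12-13: protection violations.
    return 12 if is_instruction else 13
```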

Reload 

So that the MMU may support a large variety of memory-
management architectures, it does not directly load TLB
entries that are required for address translation. It sim-
ply causes a TLB miss trap when address translation is
unsuccessful. The trap causes a program, called the
TLB reload routine, to execute. The TLB reload routine
is defined according to the structure and access method
of the page table contained in an external device or
memory.

When a TLB miss trap occurs, the LRU Recommenda-
tion Register is written with the TLB register number for
Word 0 of the TLB entry to be used by the TLB reload
routine. For instruction accesses, the Program Counter
1 Register contains the instruction address that was not
successfully translated. For data accesses, the Channel
Address Register contains the data address that was
not successfully translated.

The TLB reload routine determines the translation for 
the address given by the Program Counter 1 Register or 
Channel Address Register, as appropriate. The TLB 
reload routine uses an external page table to determine 
the required translation, and loads the TLB entry indi- 
cated by the LRU Recommendation Register so that the 
entry may perform this translation. In a demand-paged
environment, the TLB reload routine may additionally in-
voke a page-fault handler when the translation cannot
be performed.
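
The reload flow described above can be modeled roughly as follows. Everything in this sketch is an assumption made for illustration: the page table is a Python dict keyed by (process number, virtual page number), the TLB is a list of 32 two-entry lines, and the LRU Recommendation Register is reduced to the index of the recommended set rather than a TLB register number. A real reload routine would be written in Am29000 assembly against the actual page-table format of the system.

```python
# Hypothetical software model of a TLB reload routine.
PAGE_SHIFT = {1: 10, 2: 11, 4: 12, 8: 13}

def tlb_reload(tlb, page_table, lru_set, miss_addr, process, page_kb=1):
    """Load the recommended entry for miss_addr; return False on a page fault."""
    shift = PAGE_SHIFT[page_kb]
    vpn = miss_addr >> shift                 # virtual page number
    rpn = page_table.get((process, vpn))
    if rpn is None:
        return False                         # a page-fault handler would run here
    skip = shift - 10                        # VTAG bits that must be written as 0
    tlb[vpn & 0x1F][lru_set] = {
        "vtag": (miss_addr >> (15 + skip)) << skip,
        "tid": process,
        "ve": 1,
        "rpn": rpn,
    }
    return True
```

The miss address would come from the Program Counter 1 Register for an instruction miss, or from the Channel Address Register for a data miss.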

TLB entries are written by the Move To TLB (MTTLB) 
instruction, which copies the contents of a general- 
purpose register into a TLB register. The TLB register 
number is specified by bits 6-0 of a general-purpose 
register. TLB entries are read by the Move From 
TLB (MFTLB) instruction, which copies the contents of 
a TLB register into a general-purpose register. Again, 
the TLB register number is specified by a general- 
purpose register. 

Entry Invalidation 

There are two methods for invalidating TLB entries that 
are no longer required at a given point in program exe- 
cution. The first involves resetting the Valid Entry bit of a 
single entry (this is done by a Move To TLB instruction). 
The second involves changing the value of the Process 
Identifier (PID) field of the MMU Configuration Register; 
this invalidates all entries whose Task Identifier (TID) 
fields do not match the new value. 

If an entry is invalidated by changing the PID field, the
TLB entry still remains valid in some sense. If the PID
field is changed again to match the TID field, the entry
may once again participate in address translation. This
ability can be used to reduce the number of TLB misses
in a system during process switching. However, it is im-
portant to manage TLB entries so that an invalid match
cannot occur between the PID field and the TID field of
an old TLB entry.
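
The effect of changing the PID field can be seen in a toy model. The dictionary layout and the function name below are hypothetical; only the matching rule (Valid Entry bit set and TID equal to PID) comes from the text.

```python
def usable_entries(entries, pid):
    """Names of entries that can take part in User-mode translation for pid."""
    return [e["name"] for e in entries if e["ve"] and e["tid"] == pid]

entries = [
    {"name": "a", "tid": 3, "ve": 1},
    {"name": "b", "tid": 4, "ve": 1},
]
before = usable_entries(entries, 3)    # process 3 running
switched = usable_entries(entries, 4)  # PID changed: entry "a" is dormant
resumed = usable_entries(entries, 3)   # PID restored: entry "a" is usable again
```

The hazard mentioned above follows directly: if process number 3 were reassigned to a different program, entry "a" would match again even though its mapping is stale, so the control program would need to reset its Valid Entry bit first.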

Protection 

If an address translation is performed successfully, the
TLB entry used in address translation is used to perform 
protection checking for the access. There are 6 bits in 
the TLB entry for this purpose: Supervisor Read (SR), 
Supervisor Write (SW), Supervisor Execute (SE), User 
Read (UR), User Write (UW), and User Execute (UE). 
These bits restrict accesses, depending on the program 
mode of the access, as shown in Figure 61 (the value "x" 
is a "don't care"). 

Note that for the Load and Set (LOADSET) instruction, 
the protection bits must be set to allow both the load and 
store access. If this condition does not hold, neither ac- 
cess is performed. 
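
A protection check along these lines can be sketched as below. The contents of Figure 61 are not reproduced in this text, so the sketch assumes that a protection bit of 1 permits the corresponding access; the function name and the string labels for access kinds are also hypothetical.

```python
def access_allowed(bits, supervisor: bool, kind: str) -> bool:
    """Check one access against the six protection bits of a TLB entry.

    bits maps "SR", "SW", "SE", "UR", "UW", "UE" to 0 or 1.
    kind is "load", "store", "fetch", or "loadset".
    Assumes 1 means the access is permitted.
    """
    mode = "S" if supervisor else "U"
    if kind == "loadset":
        # Load and Set needs both read and write permission.
        return bool(bits[mode + "R"] and bits[mode + "W"])
    suffix = {"load": "R", "store": "W", "fetch": "E"}[kind]
    return bool(bits[mode + suffix])
```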

If protection checking indicates that a given access is 
not allowed, a Data TLB Protection Violation or Instruc- 
tion TLB Protection Violation trap occurs. The cause of 
the trap is determined by inspection of the Program
Counter 1 Register for an Instruction TLB Protection
Violation, or by inspection of the contents of the Channel
Address and Channel Control registers for a Data TLB
Protection Violation.

Figure 61. TLB Access Protection

CHANNEL DESCRIPTION

The processor channel provides the bandwidth required 
for performance, while permitting the connection of 
many different types of devices. This section describes 
the channel and methods of connecting devices and 
memories to the processor. 

The channel consists of three 32-bit synchronous buses
with associated control and status signals: the Address
Bus, Data Bus, and Instruction Bus. The Address Bus
transfers addresses and control information to devices
and memories. The Data Bus transfers data to and from
devices and memories. The Instruction Bus transfers in-
structions to the processor from instruction memories.
In addition, a set of signals allows control of the channel
to be relinquished to an external master.

There are five logical groups of signals performing five
distinct functions, as follows (since some signals per- 
form more than one function, a signal may appear in 
more than one group): 

1. Instruction Address Transfer and Instruction Ac-
cess Requests: A31-A0, SUP/US, MPGM1-
MPGM0, PEN, IREQ, IREQT, PIA, BINV

2. Instruction Transfer: I31-I0, IBREQ, IRDY, IERR,
IBACK

3. Data Address Transfer and Data Access Re-
quests: A31-A0, R/W, SUP/US, LOCK, MPGM1-
MPGM0, PEN, DREQ, DREQT1-DREQT0,
OPT2-OPT0, PDA, BINV

4. Data Transfer: D31-D0, DBREQ, DRDY, DERR,
DBACK, CDA

5. Arbitration: BREQ, BGRT, BINV

User-Defined Signals 

There are two types of user-defined outputs on the pro-
cessor to control devices and memories directly in a sys-
tem-dependent manner. Each of these outputs is valid
simultaneously with, and for the same duration as,
the address for an access.

The first set of user-defined signals, MPGM1-MPGM0, 
is determined by the PGM bits in the Translation Look- 
Aside Buffer entry used in address translation. If ad- 
dress translation is not performed, these outputs are 
both Low. 

The second set of signals, OPT2-OPT0, is determined
by bits 18-16 of the load or store instruction that initiates
an access. These signals are valid only for data ac- 
cesses, and have a predefined interpretation for 
coprocessor data transfers. 

Standard interpretations of OPT2-OPT0 are given in the 
Pin Description section. Since the OPT2-OPT0 signals 
are determined by instructions, they have an impact on 
application-software compatibility, and system hard-
ware should use the given definitions of OPT2-OPT0.




The OPT2-OPT0 signals are used to encode byte and 
half-word accesses. However, for a load, the system 
should return an entire aligned word, regardless of the 
indicated data width. 

Note that the standard interpretations of OPT2-OPT0 
apply only to accesses to instruction/data memory and 
input/output. Other interpretations may be used for 
coprocessor transfers. 

For interrupt and trap vector fetches, the MPGM1-
MPGM0 and OPT2-OPT0 outputs are all Low.

Instruction Accesses 

Instruction accesses occur to one of two address
spaces: instruction/data memory and instruction read-
only memory (instruction ROM). The distinction be-
tween these address spaces is made by the IREQT sig-
nal, which is in turn derived from the ROM Enable (RE)
bit of the Current Processor Status Register. These are
truly distinct address spaces; each may be populated in-
dependently based on the needs of a particular system.

Instruction/data memory contains both instructions 
and data. Although the channel supports separate 
instruction and data memories, the Memory Manage- 
ment Unit does not. In certain systems, it may be re- 
quired to access instructions via loads and stores, even 
though instructions may be contained in physically 
separate memories. For example, this requirement 
might be imposed because of the need to load instruc-
tions into memory. Note also that the OPT2-OPT0 sig-
nals may be used to allow the access of instructions in
instruction ROM, using loads; the Am29000 does not 
prevent a store to the instruction ROM, and protection 
against stores to the instruction ROM must be provided 
externally, if required. 

All processor instruction fetches are read accesses, and 
the R/W signal is High for all instruction fetches. 

Data Accesses 

Data accesses occur to one of three address spaces:
instruction/data memory, input/output (I/O), and the
coprocessor. The distinction between these spaces is
made by the DREQT1-DREQT0 signals, which are in
turn determined by the load or store instruction that initi-
ates a data access. Each of these address spaces is dis-
tinct from the others.

The protocol for data transfers to and from the coproces- 
sor is slightly different than the protocol for instruction/ 
data memory and I/O accesses. 

Data accesses may occur either from a slave device or 
memory to the processor (for a load), or from the pro- 
cessor to a slave device or memory (for a store). The di- 
rection of transfer is determined by the R/W signal. In 
the case of a load, the processor requires that data on 
the data bus be held valid only for a short time before the 
end of a cycle. In the case of a store, the processor
drives the data bus as soon as the bus is available and
holds the data valid until the slave device or memory sig- 
nals that the access is complete. 

Reporting Errors 

The successful completion of an instruction access is in-
dicated by an active level on the IRDY input, and the suc-
cessful completion of a data access is indicated by an
active level on the DRDY input. If there are exceptional
conditions for which an instruction or data access can-
not be completed successfully, the unsuccessful com-
pletion is indicated by an active level on the IERR or
DERR input, as appropriate.

If the processor receives an IERR or DERR in response
to an instruction or data access, it ignores the content of
the instruction or data bus and the value of IRDY or
DRDY. An IERR response causes an Instruction Access
Exception trap, unless it is associated with an instruction
that the processor does not ultimately execute (because
of a nonsequential instruction fetch). A DERR response
always causes either a Data Access Exception trap or a
Coprocessor Exception trap.

The processor supports the restarting of unsuccessful 
accesses upon an interrupt return. In the case of an un- 
successful instruction access, the restart is performed 
by the Program Counter and Program Counter 1 regis- 
ters. In the case of an unsuccessful data access, the re- 
start is performed by the Channel Address, Channel 
Data, and Channel Control registers. In any event, the 
control program must determine whether or not an ac- 
cess can and/or should be restarted. 

The Instruction Access Exception and Data Access Ex-
ception traps cannot be masked. If one of these traps
occurs within an interrupt or trap handler, the processor
state may not be recoverable.

Access Protocols 

Figure 62 shows a control flowchart for accesses per- 
formed by the Am29000. This control flow applies inde- 
pendently to both instruction and data accesses. Since 
the processor performs concurrent instruction and data 
accesses, these accesses may be at different points in 
the control flow at any given point in time. 

Note that the items on the flowchart of Figure 62 do not 
represent actual states and have no particular relation- 
ship to processor cycles. The flowchart provides only a 
high-level understanding of the control flow. Also, ex- 
ceptions and error conditions are not shown. 

The channel supports three protocols for accesses: sim-
ple, pipelined, and burst-mode. These are described in
the following sections. The various protocols are de- 
fined to accommodate minimum-latency accesses as 
well as maximum-transfer-rate accesses. The protocols 
allow an access to complete in a single cycle, although 
they support accesses requiring arbitrary numbers of 
cycles. Address transfers for accesses may be inde-
pendent of instruction or data transfers.

Simple Accesses

For a simple access, the processor holds the address 
valid throughout the entire access. This protocol is used 
for single-cycle accesses, and for accesses to simple 
devices and memories.

On any cycle before the completion of the access, a sim-
ple access may be converted to a pipelined access (by
the assertion of PEN) or to a burst-mode access (by the
assertion of IBACK or DBACK, if the processor is assert-
ing IBREQ or DBREQ). Thus, the protocol for simple ac-
cesses also may be used during the initial cycles of
pipelined and/or burst-mode accesses. This is advanta-
geous, for example, in cases where the slave device or
memory either requires the address to be held for multi-
ple cycles at the beginning of the pipelined or burst-
mode access, or cannot respond to the pipelined or
burst-mode request within one cycle.

Pipelined Accesses 

A pipelined access is one that starts before an earlier in-
progress access is completed. The in-progress access
is called a primary access and the second access is 
called a pipelined access. A pipelined access is of the 
same type as the primary access. For example, an in- 
struction access that begins before the completion of a 
data access is not considered to be a pipelined access, 
whereas a second data access is. 

The Am29000 allows only one pipelined access at any 
given time. 

Tradeoffs 

For accesses that require more than one cycle to com-
plete, pipelined accesses perform better than simple ac-
cesses because they allow the overlap of portions of two 
accesses. In addition, the ability to latch addresses in 
support of pipelined accesses reduces utilization of the 
address bus, thereby reducing contention between in- 
struction and data accesses. However, devices and 
memories that support pipelined accesses are some- 
what more complex than devices and memories that 
support only simple accesses. 

Support for pipelined operations is required for both the
primary access and the pipelined access. The slave per-
forming the primary access must contain some means
for storing the address and other information about the
access. The slave performing the pipelined access must
be able to restrict its use of the instruction bus or data
bus, and must be prepared to cancel the access (as ex-
plained below).

Pipelined Operation 

Pipelined accesses are controlled by the signals PEN,
PIA, and PDA. Because of internal data-flow con-
straints, the Am29000 does not perform a pipelined
store operation while a load is in progress. However, the
protocol does not restrict pipelined operations. Other
channel masters may perform a pipelined store during
a load.

Except as noted above, the processor attempts to per-
form pipelining for every access; the input PEN indicates
whether or not pipelining is supported for a given ac-
cess. The PEN input can be driven by individual devices,
or can be tied active or inactive to enable or disable sys-
tem-wide pipelined accesses. The processor ignores
the value of PEN unless it is performing an access.

The processor samples PEN on every cycle during a pri-
mary access. If PEN is active on any cycle, the proces-
sor ceases to drive the address and associated controls
for the primary access in the next cycle. If the processor
requires another access before the primary access is
completed, it drives the address and controls for the
second access, asserting PIA or PDA to indicate that the
second access is a pipelined access.



The output IREQ or DREQ, as appropriate, is not as-
serted for a pipelined access. Devices and memories
that cannot support pipelined accesses should there-
fore ignore PIA and/or PDA, and base their operation
upon IREQ and/or DREQ.

A device or memory that receives a request for a
pipelined access may treat it as any other access, with
one exception: the pipelined access cannot use the in-
struction and data buses or the associated controls
(e.g., IRDY or DRDY). In the case of a data read or in-
struction access, the results of the pipelined access
cannot be driven on the appropriate bus. In the case of a
data write, the data do not appear on the data bus. Any
other operations for the access, such as address decod-
ing, can occur.

When the primary access is completed (as indicated by
IRDY or DRDY), the pipelined access becomes a pri-
mary access. The processor indicates this by asserting
IREQ or DREQ, depending on the type of access. The
device or memory performing the pipelined access may
complete the access as soon as IREQ or DREQ is as-
serted (possibly in the same cycle). When the access
becomes a primary access, it controls the channel as
any other primary access. For example, it may deter-
mine whether or not another pipelined access can be
performed.

When the pipelined access becomes a primary access,
the output PIA or PDA remains asserted for one cycle to
ensure continuity of control within the slave device or
memory. In the cycle after IREQ or DREQ is asserted,
PIA or PDA is deasserted unless the processor initiates
another pipelined access, in which case PIA or PDA re-
mains asserted for the new access.

Cancellation of Pipelined Accesses 

If the processor takes an interrupt or trap before a
pipelined access becomes a primary access, the re-
quest for the pipelined access is removed from the
channel. This may occur, for example, when IERR or
DERR is signaled for the primary access.



If the pipelined access is removed from the channel, the
slave device or memory does not receive an IREQ or
DREQ for the pipelined access. Hence, the pipelined ac- 
cess does not become a primary access, and cannot be 
completed. A pipelined access may be canceled in this 
manner at any time before it becomes a primary access. 
Because of this, a pipelined access should not change 
the state of a slave device or memory until the pipelined 
access becomes a primary access. 

Burst-Mode Accesses 

A burst-mode access allows multiple instructions or 
data words at sequential addresses to be accessed with 
a single address transfer. The number of accesses per- 
formed and the timing of each access within the se- 
quence are controlled dynamically by the burst-mode 
protocol. Burst-mode accesses take advantage of se- 
quential addressing patterns, and provide several bene- 
fits over simple and pipelined accesses: 

1. Simultaneous instruction and data accesses.
Burst-mode accesses reduce the utilization of
the address bus. This is especially important for
instruction accesses, which are normally se-
quential. Burst-mode instruction accesses elimi-
nate most of the address transfers for instruc-
tions, allowing the address bus to be used for si-
multaneous data accesses.

2. Faster access times. By eliminating the ad- 
dress-transfer cycle, burst-mode accesses al- 
low addresses to be generated in a manner that 
improves access times. 

3. Faster memory access modes. Many memories 
have special high-bandwidth access modes 
(e.g., fast page mode DRAM). These modes 
generally require a sequential addressing pat- 
tern, even though addresses may not be pre- 
sented explicitly to the memory for all accesses. 
Burst-mode accesses allow the use of these ac- 
cess modes without hardware to detect sequen- 
tial addressing patterns. 

Burst-Mode Overview 

The control-flow diagrams in Figure 63 and Figure 64 il-
lustrate the operation of the processor and an instruc-
tion memory during a burst-mode instruction access.
The control-flow diagrams in Figure 65 and Figure 66 il-
lustrate the operation of the processor and a data mem-
ory or device during a burst-mode data access. These
diagrams are for illustration only; nodes on these dia-
grams do not necessarily correspond to processor or
slave states, and transitions on these diagrams do not
necessarily correspond to processor cycles.

Figure 63. Processor Burst-Mode Instruction Accesses: Control Flow



A burst-mode access is in one of the following opera-
tional conditions at any given time:

1. Established: The processor and slave device
have successfully initiated the burst-mode access.
A burst-mode access that has been established is
either active or suspended. An established burst-
mode access may become preempted, terminated,
or canceled.

2. Active: Instruction or data accesses and transfers
are being performed as the result of the burst-mode
access. An active burst-mode access may become
suspended.

3. Suspended: No accesses or transfers are being
performed as the result of the burst-mode access,
but the burst-mode access remains established.
Additional accesses and transfers may occur at
some later time (i.e., the burst-mode access may
become active) without the retransmission of the
address for the access.

4. Preempted: The burst-mode access can no
longer continue because of some condition, but the
burst-mode access can be re-established within a
short amount of time.

5. Terminated: All required accesses have been
performed.

6. Canceled: The burst-mode access can no longer
continue because of some exceptional condition.
The access may be re-established only after the
exceptional condition has been corrected, if possible.

Note: A similar state transition may be used to support suspended burst-mode data accesses
or a channel master other than the processor.

Figure 64. Slave Burst-Mode Instruction Accesses: Control Flow

Each of the above conditions, except for the terminated
condition, is under the control of both the processor and the
slave device or memory. The terminated condition is determined
by the processor, because only the processor can determine
that all required accesses have been performed. The following
sections discuss each of the above conditions with respect to
the burst-mode protocol.
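The conditions and the transitions named in their definitions can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `Condition`, `ALLOWED`, and `can_move` are assumptions introduced here, the transition set is read from the definitions above rather than from Figures 63 through 66, and reestablishment is modeled (as an assumption) as a return to the active condition.

```python
from enum import Enum


class Condition(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"
    PREEMPTED = "preempted"
    TERMINATED = "terminated"
    CANCELED = "canceled"


# An established burst-mode access is either active or suspended.
ESTABLISHED = {Condition.ACTIVE, Condition.SUSPENDED}

# Conditions that end an established burst-mode access.
ENDINGS = {Condition.PREEMPTED, Condition.TERMINATED, Condition.CANCELED}

# Allowed moves, as read from the definitions (assumption: a
# reestablished access is treated as active again).
ALLOWED = {
    Condition.ACTIVE: {Condition.SUSPENDED} | ENDINGS,
    Condition.SUSPENDED: {Condition.ACTIVE} | ENDINGS,
    Condition.PREEMPTED: {Condition.ACTIVE},
    Condition.CANCELED: {Condition.ACTIVE},
    Condition.TERMINATED: set(),
}


def can_move(src: Condition, dst: Condition) -> bool:
    """Return True if the definitions permit moving from src to dst."""
    return dst in ALLOWED[src]
```

For example, `can_move(Condition.ACTIVE, Condition.SUSPENDED)` is true, while a terminated access has no outgoing moves in this model.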

Establishing Burst-Mode Accesses 

The Am29000 attempts to perform all instruction prefetches
using burst-mode accesses, except for instruction fetches at
the last word before a 1-kb address boundary. For data
accesses, the processor attempts to perform Load Multiple and
Store Multiple operations using burst-mode accesses. The
processor indicates that it desires a burst-mode access by
asserting IBREQ or DBREQ during the cycle in which the initial
address is placed on the address bus (however, note that these
signals become valid later in the cycle than the address).



The inputs IBACK and DBACK indicate that a requested
burst-mode access is supported. The processor ignores the
value of IBACK unless IBREQ is asserted, and it ignores the
value of DBACK unless DBREQ is asserted.

When it desires a burst-mode access, the processor continues
to drive IBREQ or DBREQ on every cycle for which the address
is valid on the address bus. During this time, the device or
memory involved in the access may assert IBACK or DBACK to
indicate that it can perform the burst-mode access. If IBACK
or DBACK (as appropriate) is asserted while the initial
address appears on the address bus, the burst-mode access is
established. In the following cycle, the processor removes the
request address and deasserts IREQ or DREQ. However, it
continues to assert IBREQ or DBREQ.

Note: The Am29000 does not suspend burst-mode data accesses.
Figure 65. Processor Burst-Mode Data Accesses: Control Flow

If the burst-mode access is not established on the first
access, the processor attempts to establish a burst-mode
access on each subsequent address transfer, as long as there
are more accesses yet to be performed. During any subsequent
access, the addressed device or memory may establish a
burst-mode access by asserting IBACK or DBACK. If the
burst-mode access is never established, the default behavior
is to have the processor transmit an address for every access.
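The effect of late or absent establishment on address traffic can be illustrated with a short Python sketch. The function name `address_transfers` and its parameters are hypothetical; the model assumes a single uninterrupted sequence of accesses with no preemption, and counts one address transfer for every access up to and including the one on which the slave first asserts IBACK or DBACK.

```python
from typing import Optional


def address_transfers(n_accesses: int, first_ack: Optional[int] = None) -> int:
    """Count address transfers for a sequence of n_accesses accesses.

    first_ack is the 0-based index of the access on which the slave
    first asserts IBACK/DBACK, or None if it never does. After the
    burst is established, no further addresses are transmitted
    (assumption: no preemption occurs in the sequence).
    """
    if first_ack is None or first_ack >= n_accesses:
        # Never established: one address per access (default behavior).
        return n_accesses
    # Addresses are sent for accesses 0..first_ack inclusive.
    return first_ack + 1
```

With eight accesses, establishment on the first access needs one address transfer, establishment on the fourth needs four, and a slave that never acknowledges sees all eight.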

Active and Suspended Burst-Mode Accesses 



After the burst-mode access is established, IBREQ and DBREQ
are used during subsequent accesses to indicate that the
processor requires at least one more access. If IBREQ or DBREQ
is active at the end of the cycle in which an access is
successfully completed (i.e., when IRDY or DRDY is active),
the processor requires another access. If the slave device or
memory previously has not preempted the burst-mode access, and
does not preempt (by deasserting IBACK or DBACK) or cancel (by
asserting IERR or DERR) the burst-mode access in the cycle
that the access completes, the additional access must be
performed.

The execution rate of instructions is known only dynamically,
so that in certain situations, a burst-mode instruction access
must be suspended. If IBREQ is inactive during the cycle in
which an instruction access is completed, the burst-mode
access is suspended (if it is neither preempted nor canceled
at the same time). The burst-mode access remains suspended
unless the processor requests a new instruction access (in
which case IREQ is asserted), or unless the instruction memory
preempts the burst-mode access.

Figure 66. Slave Burst-Mode Data Accesses: Control Flow

A suspended burst-mode instruction access becomes active
whenever the processor can accept more instructions. The
processor activates the burst-mode access by asserting IBREQ.
If the instruction memory does not preempt the burst-mode
access during this cycle, an instruction access must be
performed.

When a suspended burst-mode instruction access is activated,
the resulting instruction access is not permitted to be
completed in the cycle in which IBREQ is asserted, but may be
completed in the next cycle. The reason for this restriction
is that the burst-mode protocol is defined such that the
combination of an active level on IBREQ and IRDY causes an
instruction access (as previously discussed). If the
instruction access is completed immediately in the cycle where
a suspended burst-mode access is activated, there is an
ambiguity in the protocol: it is possible to interpret a
single-cycle assertion of IBREQ as a request for two
instructions.

The above ambiguity is resolved by delaying the instruction
access resulting from a reactivated burst-mode access for a
cycle. Since this restriction applies only when the
Instruction Prefetch Buffer is full and the instruction memory
is capable of a very fast access, the delayed instruction
response has no performance impact.
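The one-cycle delay on reactivation can be expressed as a rule over a per-cycle IBREQ trace. The Python sketch below is an assumption-laden illustration, not part of the channel definition: the function name `earliest_irdy_cycles` is hypothetical, the burst is assumed to be suspended before the trace begins, and each inactive-to-active change on IBREQ is treated as a reactivation.

```python
from typing import List


def earliest_irdy_cycles(ibreq: List[bool]) -> List[int]:
    """For each cycle in which a suspended burst is reactivated
    (IBREQ goes from inactive to active), return the earliest cycle
    in which the resulting access may complete: the following cycle.
    """
    cycles = []
    prev = False  # assumption: burst is suspended before the trace
    for i, cur in enumerate(ibreq):
        if cur and not prev:
            # The access may not complete in the reactivating cycle.
            cycles.append(i + 1)
        prev = cur
    return cycles
```

A slave designed against this rule would hold IRDY inactive in the reactivating cycle, even if its data is already available.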

The Am29000 does not suspend burst-mode data accesses because
the data transfers occur to and from general-purpose
registers, which are always available. However, other channel
masters may suspend burst-mode data accesses (during direct
memory accesses, for example). The principles for suspending
burst-mode accesses are the same as those for instruction
accesses discussed above.

Processor Preemption, Termination, 
and Cancellation 

The processor may preempt, terminate, or cancel a burst-mode
access by deasserting IBREQ or DBREQ and asserting IREQ or
DREQ at some later point. Normally, the processor receives one
more instruction or data word after IBREQ or DBREQ is
deasserted. However, this access may be completed in the same
cycle that IBREQ or DBREQ is deasserted. During the period
after IBREQ or DBREQ is deasserted and before IREQ or DREQ is
asserted, the burst-mode access is in a suspended condition.

The slave device or memory cannot distinguish between
preempted, terminated, and canceled burst-mode accesses, when
these are caused by the processor, until the processor asserts
IREQ or DREQ. If the slave continues to assert IBACK or DBACK
after IBREQ or DBREQ is deasserted, the slave should be
prepared to accept any new request during the cycle in which
IREQ or DREQ is asserted to begin the new access. The reason
for this is that the processor may attempt to establish a
burst-mode access for the new access: if the slave is
asserting IBACK or DBACK because of a previously preempted,
terminated, or canceled burst-mode access, the processor
interprets the active IBACK or DBACK as establishing the new
burst-mode access and removes the request in the following
cycle.

The processor preempts a burst-mode access when an 
external channel master arbitrates for the channel, or 
when a burst-mode fetch crosses a potential virtual- 
page boundary. Since the minimum page size is 1 kb, 
burst-mode instruction and data accesses are preempted
whenever the address sequence crosses a 1-kb address
boundary. The burst is reestablished as soon
as a new address translation is performed (if required). 
A new physical address is transmitted when the burst- 
mode access is reestablished. 

Note that the preemption resulting from page bound- 
aries is advantageous for devices or memories that 
require counters to follow the burst-mode address 
sequence. Since all burst-mode accesses are word 
accesses and the processor retransmits an address at 
every 1-kb address boundary, an 8-bit counter in the 
slave device or memory is sufficient to follow the burst- 
mode address sequence. Additional address bits are 
simply latched. 
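The counter arrangement described above can be sketched in Python. The sketch assumes byte addresses with 4-byte words, so that the 8-bit counter tracks address bits 9 through 2 and the remaining high-order bits are latched; the names `burst_addresses`, `WORD_BYTES`, and `BOUNDARY_BYTES` are hypothetical and introduced only for illustration.

```python
from typing import List

WORD_BYTES = 4          # assumption: 32-bit word accesses
BOUNDARY_BYTES = 1024   # processor retransmits an address at each 1-kb boundary


def burst_addresses(start: int, count: int) -> List[int]:
    """Addresses a slave can generate from one transmitted start address.

    The high-order bits are latched once; an 8-bit counter follows
    the word index inside the 1-kb block. The sequence stops when the
    counter wraps, because the processor preempts the burst there and
    transmits a new address.
    """
    upper = start & ~(BOUNDARY_BYTES - 1)   # latched high-order bits
    counter = (start >> 2) & 0xFF           # 8-bit word counter
    out = []
    for _ in range(count):
        out.append(upper | (counter << 2))
        counter = (counter + 1) & 0xFF
        if counter == 0:
            break  # 1-kb boundary reached: wait for a new address
    return out
```

Because 1024 bytes hold 256 words, the counter never needs more than eight bits, whatever the starting address.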

The processor terminates a burst-mode access when- 
ever all required instructions or data have been ac- 
cessed. In the case of instruction accesses, the burst- 
mode access is terminated when a nonsequential fetch 
occurs. In the case of data accesses, the burst-mode 
access is terminated when the count indicates a single 
load or store remains. The last load or store is executed 
as a simple access. 

The processor cancels a burst-mode access when an 
interrupt or trap is taken. Note that a trap may be caused 
by the burst-mode access, for example when a Translation
Look-Aside Buffer miss occurs on an address in the
burst-mode sequence. If the processor cancels a burst- 
mode access when an access in the sequence remains 
to be completed, this access must be completed in spite 
of the cancellation. 

Canceled burst-mode data accesses may be restarted 
at some (possibly much later) point in execution via the 
Channel Address, Channel Data, and Channel Control 
registers. In this case, the burst-mode access is re- 
started at the point at which it was canceled, rather than 
at the beginning of the original address sequence. 

Slave Preemption and Cancellation 

The slave device or memory involved in a burst-mode access may
preempt the access by deasserting IBACK or DBACK. The
processor samples IBACK and DBACK when IRDY and DRDY are
active so that IBACK and DBACK may be deasserted as the last
supported access is completed. However, IBACK and DBACK also
may be deasserted in any cycle before the access is completed.
If IBACK or DBACK is deasserted when the processor is in a
state where it expects an access, the access must be
completed.
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In general, the slave device or memory preempts the 
burst-mode access whenever it cannot support any fur- 
ther accesses in the burst-mode sequence. This nor- 
mally occurs whenever an implementation-dependent 
address boundary is encountered (e.g., a cache-block 
boundary), but may occur for any reason. By preempt- 
ing the burst-mode access, the slave receives a new re- 
quest with the address of the next instruction or data 
word required by the processor. 

The slave device or memory may cancel a burst-mode access by
asserting IERR or DERR in response to a requested access. The
signals IBACK or DBACK need not be deasserted at this time,
but should be deasserted in the next cycle.

Note that the IERR and DERR signals cause non-maskable traps,
except in the case where IERR is asserted for an instruction
that the processor does not execute.

Arbitration 

External masters can gain access to the address, data, and
instruction buses by asserting the BREQ input. The processor
completes any pending access, preempts any burst-mode access,
and asserts the BGRT output. At this time, the processor
places all channel outputs associated with the address, data,
and instruction buses in the high-impedance state.



For the first cycle in which BGRT is asserted, the output BINV
is also asserted. If the external master cannot control the
address bus and associated controls in the cycle where BGRT is
asserted, the active level on BINV may be used to define an
idle cycle for the channel (i.e., any spurious access requests
are ignored). The BINV signal is asserted only for a single
cycle, so the external master must take control of the channel
in the cycle after BGRT is asserted.



While the BREQ input remains asserted, the processor
continues to assert BGRT. The external master has con- 
trol over the channel during this time. 

To release the channel to the processor, the external
master deasserts BREQ, but must continue to control
the channel for the first cycle in which BREQ is
deasserted. In the cycle after BREQ is deasserted, the
processor asserts BINV and deasserts BGRT; the exter- 
nal master should release control of the channel at this 
time. On the following cycle, the processor deasserts 
BINV and is able to use the channel. The processor 
reestablishes any burst-mode access preempted by 
arbitration. 
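The grant and release handshake can be traced cycle by cycle. The Python sketch below is a simplified model under stated assumptions: the processor has no pending access and LOCK is inactive, so BGRT is taken to follow BREQ by exactly one cycle (the real grant latency depends on pending accesses), and BINV is asserted in any cycle in which BGRT changes level. The function name `arbitration_trace` is hypothetical.

```python
from typing import List, Tuple


def arbitration_trace(breq: List[bool]) -> Tuple[List[bool], List[bool]]:
    """Derive BGRT and BINV per cycle from a BREQ trace.

    Assumptions: no pending access and LOCK inactive, so BGRT follows
    BREQ by one cycle. BINV is active for the first cycle of a grant
    and for the cycle in which BGRT is deasserted.
    """
    bgrt, binv = [], []
    prev_breq = False
    prev_bgrt = False
    for b in breq:
        g = prev_breq              # BGRT reflects last cycle's BREQ
        bgrt.append(g)
        binv.append(g != prev_bgrt)  # one-cycle BINV on each handover
        prev_breq, prev_bgrt = b, g
    return bgrt, binv
```

In the trace produced for a three-cycle request, the external master would drive the channel only in cycles where BGRT is active and BINV is inactive, plus the first cycle in which BREQ is deasserted, which matches the release sequence described above.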

The processor does not relinquish the channel when the
LOCK signal is active. This prevents external masters 
from interfering with exclusive accesses. 
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Use of BINV to Cancel an Access 



Besides using the BINV signal to transfer control of the
channel from one master to another, the Am29000 uses the BINV
signal to cancel accesses after they have been initiated. To
cancel an access, BINV is asserted during a cycle in which
IREQ or DREQ also is asserted. If an access is canceled, the
accompanying response (using IRDY, IERR, DRDY, or DERR) is
ignored during the cycle where BINV is asserted; thereafter,
the system should not respond to the canceled access.

The BINV signal is used to cancel an instruction access 
in the following situations: 

■ when an interrupt or trap is taken

■ when an instruction fetch-ahead is canceled
because a target block is only partially present
in the Branch Target Cache

■ when an instruction TLB miss or protection
violation occurs on an instruction access

■ when a branch instruction is the delay instruction
of another branch, and the targets of both
branches are in the Branch Target Cache (in
this case, the external fetch for the target of
the first branch is not required)

■ when the processor enters the Load Test Instruction
mode, and there is an active instruction
request on the channel

The BINV signal is used to cancel a data access in the 
following situations: 

■ when a data TLB miss or protection violation 
occurs on the data access 

■ when an interrupt or trap is taken in the cycle 
where a pipelined data access becomes a pri- 
mary access 

If, for data accesses, address translation is not performed
and pipelined accesses are not implemented,
the BINV signal can be ignored by the system during the 
access. 

When a LOADSET instruction encounters a protection 
violation because store access is not permitted, the
processor cancels the load access with BINV. 

Bus Sharing— Electrical Considerations 

When buses are shared among multiple masters and slaves, it is
important to avoid situations where these devices are driving
a bus at the same time. This may occur when more than one
master or slave is allowed to drive a bus in the same cycle if
bus arbitration is incompletely or incorrectly performed.
However, it also occurs when a master or slave releases a bus
in the same cycle that another master or slave gains control,
and the first master or slave is slow in disabling its bus
drivers, compared to the point at which the second master or
slave begins to drive the bus. The latter situation is called
a bus collision in the following discussion.

In addition to the logical errors that can occur when mul- 
tiple devices drive a bus simultaneously, such situations 
may cause bus drivers to carry large amounts of electri- 
cal current. This can have a significant impact on driver 
reliability and power dissipation. Since bus collisions 
usually occur for a small amount of time, they are of less
concern, but may contribute to high-frequency electro- 
magnetic emissions. 

The Am29000 channel is defined to prevent all situ- 
ations where multiple drivers are driving a bus simulta- 
neously. However, bus collisions may be allowed to oc- 
cur, depending on the system design. 

In the case of the Am29000 channel, arbitration for the 
channel prevents the processor from driving the ad- 
dress and data buses at the same time as another chan- 
nel master. If there is more than one external master, 
the system design must include some means for ensur- 
ing that only one external master gains control of the 
channel, and that no external master gains control of the 
channel at the same time as the processor. 

When the processor relinquishes control of the channel 
to an external master, bus collisions may be prevented 
by not allowing the external master to drive any bus
while BINV is active. This ensures that all processor out- 
puts are disabled by the time the external master takes 
control of the channel. However, there is nothing in the 
channel protocol to prevent the external master from
taking control as soon as BGRT is asserted. 

Slave devices and memories are prevented from simultaneously
driving the instruction bus or data bus by
allowing only the device or memory performing a pri- 
mary access to drive the appropriate bus. When a 
pipelined access becomes a primary access, it may 
drive the instruction or data bus immediately, so there is 
a potential bus collision if the pipelined access is 
performed by a slave other than the slave performing 
the original primary access. This bus collision may be 
prevented by restricting all slaves to driving the instruc- 
tion and data buses in the second half-cycle (using 
SYSCLK, for example). Since the processor samples 
data only at the end of a cycle, this restriction does not 
affect performance. 

When the processor performs a store immediately fol- 
lowing a load, it drives the data bus for the store in the 
second cycle following the cycle in which the data for the 
load appears on the data bus. This provides a complete 
cycle for the slave involved in the load to disable its data 
drivers. The processor continues to drive the data bus
until it receives a DRDY or DERR in response to the
store; it ceases to drive the data bus in the cycle following
the response.
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Channel Behavior for Interrupts 
and Traps 

If an interrupt or trap is taken, any burst-mode accesses 
are canceled. If a request for a pipelined access is on the
address bus, this request is removed. Any other accesses
are completed and no new accesses are started,
other than those required for the interrupt or trap. Note
that any accesses that the processor expects to com- 
plete must be completed, even though burst-mode and 
pipelined accesses are canceled. 

When interrupt or trap processing is complete, any can- 
celed burst-mode access transactions are reestab- 
lished using the address of the access that was to be 
performed next when the interrupt or trap was taken. 
Uncompleted pipelined accesses are restarted, either 
by the interrupt return sequence in the case of an instruction
access, or by restarting the initiating instruction
in the case of a data access. 

Note that the restarting of a pipelined access is not per- 
formed by the Channel Address, Channel Data, and 
Channel Control registers, since these registers may be 
required to restart the primary access. The instruction 
initiating the pipelined access is not allowed to be com- 
pleted until the primary access is completed, so that the 
Program Counter 1 (PC1) register contains the address 
of the initiating instruction when a pipelined access is
canceled. The address in PC1 can restart this instruction
on interrupt return.



Effect of the LOCK Output 

The LOCK output provides synchronization and exclusion of
accesses in a multiprocessor environment. LOCK has no
predefined effect for a system, other than the fact that the
Am29000 does not grant the channel to an external master while
LOCK is active.



The LOCK output is asserted for the address cycle of the
Load-and-Lock and Store-and-Lock instructions, and is asserted
for both the read and write accesses of a Load and Set
instruction. LOCK may also be active for an extended period of
time under control of the Lock bit in the Current Processor
Status Register (this capability is available only to
Supervisor-mode programs).

LOCK may be defined to provide any level of resource 
locking for a particular system. For example, it may lock 
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the channel, an individual device or memory, or a loca- 
tion within a device or memory. 

When a resource is locked, it is available for access only 
by the processor with the appropriate access privilege. 
The mechanisms for restricting accesses and the meth- 
ods for reporting attempted violations of the restrictions 
are system-dependent. 

Initialization and Reset 

When power is first applied to the processor, it is in an 
unknown state and must be placed in a known state. 
Also, under certain circumstances, it may be necessary 
to place the processor in a defined state. This is accomplished
by the Reset mode, which is invoked by activating
the RESET pin for the required duration. The Reset
mode configures the processor state as follows: 

1. Instruction execution is suspended. 

2. Instruction fetching is suspended.

3. Any interrupt or trap conditions are ignored. 

4. The Current Processor Status Register is set as 
shown in Figure 67. 

5. The Cache Disable bit of the Configuration Reg- 
ister is set. 

6. The Data Width Enable bit of the Configuration 
Register is reset. 

7. The Contents Valid bit of the Channel Control 
Register is reset. 

Except as previously noted, the contents of all general-purpose
registers, special-purpose registers, and TLB
registers are undefined. The contents of the Branch Tar- 
get Cache are also undefined. 

The Reset mode also configures the processor to initiate an
instruction fetch using an address of 0. Since the ROM Enable
(RE) bit of the Current Processor Status is 1, this fetch is
directed to external instruction read-only memory. This fetch
occurs when the Reset mode is exited (i.e., when the RESET
input is deasserted).



The Reset mode is invoked by asserting the RESET input and can
be entered only if the SYSCLK pin is operating normally,
whether or not the SYSCLK pin is being driven by the
processor. The Reset mode is entered within four processor
cycles after RESET is asserted. The RESET input must be
asserted for at least four processor cycles to accomplish a
processor reset.
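The four-cycle requirement translates directly into a minimum assertion time for a given clock. The following Python sketch is a worked calculation only; the function name `min_reset_assert_ns` is hypothetical, and the 40 ns example period corresponds to the minimum SYSCLK period listed for the 25 MHz part in the switching characteristics.

```python
def min_reset_assert_ns(sysclk_period_ns: float, cycles: int = 4) -> float:
    """Minimum time RESET must be held asserted, in nanoseconds.

    RESET must be asserted for at least four processor cycles, so the
    minimum duration is four SYSCLK periods.
    """
    return cycles * sysclk_period_ns


# Example: a 40 ns SYSCLK period (25 MHz) requires at least 160 ns.
reset_ns_25mhz = min_reset_assert_ns(40)
```

A slower clock lengthens the requirement proportionally, so a reset circuit sized for the fastest clock is not sufficient for a slower one.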

The Reset mode can be entered from any other processor mode
(e.g., the Reset mode can be entered from the Halt mode). If
the RESET input is asserted at the time that power is first
applied to the processor, the processor enters the Reset mode
only after four cycles have occurred on the SYSCLK pin.



The Reset mode is exited when the RESET input is deasserted.
Either three or four cycles after RESET is deasserted
(depending on internal synchronization time), the processor
performs an initial instruction access on the channel. The
initial instruction access is directed to Address 0 in the
instruction read-only memory (instruction ROM). If instruction
ROM is not implemented in a particular system, another device
or memory must respond to this instruction fetch.



If the CNTL1-CNTL0 inputs are 10 or 01 when RESET is
deasserted, the processor enters the Halt or Step mode,
respectively. If the processor enters the Halt mode
immediately after reset, the protection checking that normally
applies to the Halt instruction is disabled so that the Halt
instruction can be used as an instruction breakpoint in a
User-mode program. The Load Test Instruction mode cannot be
directly entered from the Reset mode. If the CNTL1-CNTL0
inputs are 00 immediately after RESET is deasserted, the
effect on processor operation is unpredictable. If the
CNTL1-CNTL0 inputs are 11, the processor enters the Executing
mode.
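The mapping from the CNTL1-CNTL0 inputs to the mode entered after reset can be written as a small lookup. This Python sketch simply restates the paragraph above; the names `MODE_AFTER_RESET` and `mode_after_reset` are hypothetical, and the string "unpredictable" stands for the 00 case, for which no defined behavior is given.

```python
# Mode entered when RESET is deasserted, keyed by (CNTL1, CNTL0).
MODE_AFTER_RESET = {
    0b10: "Halt",
    0b01: "Step",
    0b11: "Executing",
}


def mode_after_reset(cntl1: int, cntl0: int) -> str:
    """Return the mode selected by CNTL1-CNTL0 as RESET is deasserted.

    The 00 encoding has no defined result, so it is reported as
    "unpredictable" rather than mapped to a mode.
    """
    code = ((cntl1 & 1) << 1) | (cntl0 & 1)
    return MODE_AFTER_RESET.get(code, "unpredictable")
```

A board-level test fixture could use such a table to check that its CNTL strapping never presents 00 at the moment RESET is released.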

The processor samples the STAT0 output internally when RESET
is asserted. A High level on STAT0 in this case is used to
enable a special test configuration and causes the processor
to be inoperable. When RESET is asserted, the processor drives
STAT0 Low in order to disable this test configuration.
However, if processor outputs are disabled by the Test mode,
the processor is not able to drive STAT0. Thus, if RESET is
asserted when the processor is in the Test mode, the STAT0 pin
must be driven Low externally. (In a master/slave
configuration, STAT0 is driven Low by the master processor
when RESET is asserted.)
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ABSOLUTE MAXIMUM RATINGS 

Storage Temperature          -65 to +150°C

Voltage on any Pin
with Respect to GND          -0.5 to Vcc +0.5 V

Stresses above those listed under ABSOLUTE MAXIMUM RATINGS may
cause permanent device failure. Functionality at or above
these limits is not implied. Exposure to absolute maximum
ratings for extended periods may affect device reliability.



OPERATING RANGES 

Commercial (C) Devices
   Case Temperature (TC)     0 to +85°C
   Supply Voltage (Vcc)      +4.75 to +5.25 V

Military Devices
   Case Temperature (TC)*    -55 to +125°C
   Supply Voltage (Vcc)      +4.5 to +5.5 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.
*measured "instant on"



DC CHARACTERISTICS over COMMERCIAL and MILITARY operating ranges 



Parameter  Parameter                   Test
Symbol     Description                 Conditions                    Min.       Max.        Unit

VIL        Input Low Voltage                                         -0.5       0.8         V
VIH        Input High Voltage                                        2.0        Vcc +0.5    V
VILINCLK   INCLK Input Low Voltage                                   -0.5       0.8         V
VIHINCLK   INCLK Input High Voltage                                  2.0        Vcc +0.5    V
VILSYSCLK  SYSCLK Input Low Voltage                                  -0.5       0.8         V
VIHSYSCLK  SYSCLK Input High Voltage                                 Vcc -0.8   Vcc +0.5    V
VOL        Output Low Voltage for      IOL = 3.2 mA                             0.45        V
           All Outputs except SYSCLK
VOH        Output High Voltage for     IOH = -400 µA                 2.4                    V
           All Outputs except SYSCLK
ILI        Input Leakage Current       0.45 V ≤ VIN ≤ Vcc -0.45 V               ±10         µA
ILO        Output Leakage Current      0.45 V ≤ VOUT ≤ Vcc -0.45 V              ±10         µA
ICCOP      Operating Power-Supply      Vcc = 5.25 V, Outputs                    22 for      mA/MHz
           Current                     Floating; Holding RESET                  Commercial
                                       active with externally                   25 for
                                       supplied SYSCLK                          Military
VOLC       SYSCLK Output Low Voltage   IOLC = 20 mA                             0.6         V
VOHC       SYSCLK Output High Voltage  IOHC = 20 mA                  Vcc -0.6               V
IOSGND     SYSCLK GND Short            Vcc = 5.0 V                   100                    mA
           Circuit Current
IOSVCC     SYSCLK Vcc Short            Vcc = 5.0 V                   100                    mA
           Circuit Current
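Because ICCOP is specified per megahertz, the maximum supply current scales with the operating frequency. The Python sketch below works through that arithmetic; the function names are hypothetical, and the result applies only under the stated test conditions (Vcc = 5.25 V, outputs floating, RESET held active with an externally supplied SYSCLK), so it should not be read as a worst-case figure for a loaded, running system.

```python
def max_supply_current_ma(freq_mhz: float, ma_per_mhz: float = 22) -> float:
    """Maximum ICCOP in mA: 22 mA/MHz commercial, 25 mA/MHz military."""
    return ma_per_mhz * freq_mhz


def max_power_w(freq_mhz: float, vcc: float = 5.25, ma_per_mhz: float = 22) -> float:
    """Corresponding supply power in watts at the given Vcc."""
    return max_supply_current_ma(freq_mhz, ma_per_mhz) * vcc / 1000.0


# Example: a commercial part clocked at 25 MHz.
icc_ma = max_supply_current_ma(25)   # 550 mA
power_w = max_power_w(25)            # about 2.89 W at 5.25 V
```

For a military part the same calculation uses 25 mA/MHz, giving 500 mA at 20 MHz.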



CAPACITANCE 



Parameter  Parameter                  Test
Symbol     Description                Conditions            Min.    Max.    Unit

CIN        Input Capacitance          fC = 1 MHz (Note 1)           15      pF
CINCLK     INCLK Input Capacitance                                  20      pF
CSYSCLK    SYSCLK Capacitance                                       90      pF
COUT       Output Capacitance                                       20      pF
CI/O       I/O Pin Capacitance                                      20      pF

Note: 1. Not 100% tested.
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SWITCHING CHARACTERISTICS over COMMERCIAL operating range 




                                                               33 MHz                    25 MHz
No.  Parameter Description                   Test Conditions   Min.         Max.         Min.         Max.         Unit

1    System Clock (SYSCLK) Period (T)        Note 1                                      40           1000         ns
1A   SYSCLK at 1.5 V to SYSCLK at 1.5 V      Note 13                                     0.5T-1       0.5T+1       ns
     when used as an output
2    SYSCLK High Time when used as input     Note 13                                     19                        ns
3    SYSCLK Low Time when used as input      Note 13                                     17                        ns
4    SYSCLK Rise Time                        Note 2                                                   5            ns
5    SYSCLK Fall Time                        Note 2                                                   5            ns
6    Synchronous SYSCLK Output Valid Delay   Notes 3, 12                                 3            14           ns
6A   Synchronous SYSCLK Output Valid Delay   Note 12                                     4            18           ns
     for D31-D0
7    Three-State Synchronous SYSCLK          Notes 4, 14, 15                             3            30           ns
     Output Invalid Delay
8    Synchronous SYSCLK Output Valid Delay   Notes 5, 12                                 3            14           ns
8A   Three-State SYSCLK Synchronous          Notes 5, 14, 15                             3            30           ns
     Output Invalid Delay
9    Synchronous Input Setup Time            [illegible]                                 12                        ns
9A   Synchronous Input Setup Time                                                        6                         ns
     for D31-D0, I31-I0
9B   Synchronous Input Setup Time for DRDY                                               13                        ns
10   Synchronous Input Hold Time             Note 6                                      2                         ns
11   Asynchronous Input Minimum Pulse Width  Note 8                                      T+10                      ns
12   INCLK Period                                                                        20           500          ns
12A  INCLK to SYSCLK Delay                                                               2            10           ns
12B  INCLK to SYSCLK Delay                                                               2            10           ns
13   INCLK Low Time                                                                      8                         ns
14   INCLK High Time                                                                     8                         ns
15   INCLK Rise Time                                                                                  5            ns
16   INCLK Fall Time                                                                                  5            ns
17   INCLK to Deassertion of RESET           Note 9                                                   5            ns
     (for phase synchronization of SYSCLK)
18   WARN Asynchronous Deassertion Hold      Note 10                                     4T                        ns
     Minimum Pulse Width
19   BINV Synchronous Output Valid Delay     Note 12                                     1            7            ns
     from SYSCLK
20   Three-State Synchronous SYSCLK          Notes 11, 14, 15                            3            20           ns
     Output Invalid Delay for D31-D0
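The switching parameters can be combined into a rough response-time budget for a slave that must answer within a single cycle. The Python sketch below is a simplified estimate under stated assumptions: the address is launched from the SYSCLK rising edge at the start of the cycle (parameter 6), the response must meet the input setup time before the end of the same cycle (parameter 9A for data and instruction words, 9B for DRDY), and clock skew, buffer delays, and board propagation are ignored. The function name `slave_response_budget_ns` is hypothetical.

```python
def slave_response_budget_ns(period_ns: float,
                             output_valid_max_ns: float,
                             input_setup_ns: float) -> float:
    """Time left for the slave between address valid and input setup.

    budget = T - (max output valid delay) - (input setup time)
    Skew and interconnect delays are not included (assumption).
    """
    return period_ns - output_valid_max_ns - input_setup_ns


# 25 MHz columns: T = 40 ns, parameter 6 max = 14 ns.
data_budget = slave_response_budget_ns(40, 14, 6)    # parameter 9A: 6 ns
drdy_budget = slave_response_budget_ns(40, 14, 13)   # parameter 9B: 13 ns
```

Under these assumptions a single-cycle memory has about 20 ns to return a data or instruction word and about 13 ns to return DRDY at 25 MHz; any real design would subtract its own decode and buffer delays from these figures.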
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SWITCHING CHARACTERISTICS over COMMERCIAL operating range 




                                                               20 MHz                    16 MHz
No.  Parameter Description                   Test Conditions   Min.         Max.         Min.         Max.         Unit

1    System Clock (SYSCLK) Period (T)        Note 1            50           1000         60           1000         ns
1A   SYSCLK at 1.5 V to SYSCLK at 1.5 V      Note 13           0.5T-1       0.5T+1       0.5T-2       0.5T+2       ns
     when used as an output
2    SYSCLK High Time when used as input     Note 13           22                        27                        ns
3    SYSCLK Low Time when used as input      Note 13           19                        22                        ns
4    SYSCLK Rise Time                        Note 2                         5                         5            ns
5    SYSCLK Fall Time                        Note 2                         5                         5            ns
6    Synchronous SYSCLK Output Valid Delay   Notes 3, 12       3            [illegible]  3            16           ns
6A   Synchronous SYSCLK Output Valid Delay   Note 12           [illegible]  [illegible]  4            20           ns
     for D31-D0
7    Three-State Synchronous SYSCLK          Notes 4, 14, 15   [illegible]  30           3            30           ns
     Output Invalid Delay
8    Synchronous SYSCLK Output Valid Delay   Notes 5, 12       [illegible]  16           3            16           ns
8A   Three-State SYSCLK Synchronous          Notes 5, 14, 15   3            30           3            30           ns
     Output Invalid Delay
9    Synchronous Input Setup Time            [illegible]       15                        15                        ns
9A   Synchronous Input Setup Time                              8                         8                         ns
     for D31-D0, I31-I0
9B   Synchronous Input Setup Time for DRDY                     16                        16                        ns
10   Synchronous Input Hold Time             Note 6            2                         2                         ns
11   Asynchronous Input Minimum Pulse Width  Note 8            T+10                      T+10                      ns
12   INCLK Period                                              25           500          30           500          ns
12A  INCLK to SYSCLK Delay                                     2            12           2            15           ns
12B  INCLK to SYSCLK Delay                                     2            12           2            15           ns
13   INCLK Low Time                                            10                        12                        ns
14   INCLK High Time                                           10                        12                        ns
15   INCLK Rise Time                                                        5                         5            ns
16   INCLK Fall Time                                                        5                         5            ns
17   INCLK to Deassertion of RESET           Note 9                         5                         5            ns
     (for phase synchronization of SYSCLK)
18   WARN Asynchronous Deassertion Hold      Note 10           4T                        4T                        ns
     Minimum Pulse Width
19   BINV Synchronous Output Valid Delay     Note 12           1            8            1            9            ns
     from SYSCLK
20   Three-State Synchronous SYSCLK          Notes 11, 14, 15  3            25           3            25           ns
     Output Invalid Delay for D31-D0
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SWITCHING CHARACTERISTICS over MILITARY operating range 




Parameter 
Description 


Test 
Conditions 


20 MHz 


16 MHz 




No. 


Min. 


Max. 


Min. 


Max. 


Unit 


1 


System Clock (SYSCLK) 
Period (T) 


Notel 


50 


1000 


60 


1000 


ns 


1A 


SYSCLK at 1.5V to SYSCLK 
at 1 .5V when used as an output 


Note 13 


0.5T-1 


0.5T+1 


0.5T-2 


0.5T+2 


ns 


2 


SYSCLK High Time when used as input 


Note 13 


22 




27 




ns 


3 


SYSCLK Low Time when used as input 


Note 13 


19 




22 




ns 


4 


SYSCLK Rise Time 


Note 2 




5 




5 


ns 


5 


SYSCLK Fall Time 


Note 2 




5 




5 


ns 


6 


Synchonous SYSCLK Output 
Valid Delay 


Notes 3, 12 


3 


. 16 


3 


16 


ns 


6A 


Synchronous SYSCLK Output 
Valid Delay for D„-Do 


Note 12 


<^ 


.. 20-' 


4 


20 


ns 


7 


Three-State Synchronous SYSCLK 
Output Invalid Delay 


Notes 4, 
14.15 


"'* 3' . • 


- 30 


3 


30 


ns 


8 


Synchronous SYSCLK 
Output Valid Delay 


Notes5. 12r 


, . '3 . ' 


16 


3 


16 


ns 


8A 


Three-State SYSCLK 
Synchronous Output Invalid Delay 


Notesj5:\C 
14.|S;\\ 


, - 3 


30 


3 


30 


ns 


9 


Synchronous Input Setup Time 


m»:tS:^^ 


15 




15 




ns 


9A 


Synchronous Input Setup Time 
for D3,-Do. l3,-lo 




8 




8 




ns 


9B 


Synchronous Input Setup Time a^ 




16 




16 




ns 


10 


Synchronous Input Hold Time ,,,..,^,'' < •'> 


,/>Note 6 


2 




2 




ns 


11 


Asynchronous Input Minimum/';' )'->» 

Pulse Width ..-^?\ "r*'<5 


Notes 


T+10 




T+10 




ns 


12 


INCLK Period <i 1;| ''^';> 




25 


500 


30 


500 


ns 


12A 


INCLK to SYSCLK Delay " H K 






12 




15 


ns 


12B 


INCLK to SYScLk Delay "* 






12 




15 


ns 


13 


INCLK Low Time 




10 




12 




ns 


14 


INCLK High Time 




10 




12 




ns 


15 


INCLK Rise Time 






5 




5 


ns 


16 


INCLK Fall Time 






5 




5 


ns 


17 


INCLK to Deassertion of RESET 

(for phase synchronization of SYSCLK) 


Note 9 





5 





5 


ns 


18 


WARN Asynchronous Deassertion 
Hold Minimum Pulse Width 


Note 10 


4T 




4T 




ns 


19 


BIN V Synchronous Output Valid 
Delay from SYSCLK 


Note 12 


1 


8 


1 


9 


ns 


20 


Three-State synchronous SYSCLK 
output invalid delay for Dji-Do 


Notes 11, 
14,15 


4 


25 


. 4 


25 


ns 
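
Several limits in the switching characteristics are expressed in terms of the SYSCLK period T rather than as fixed values. The following Python sketch evaluates three of them (parameters 1A, 11, and 18) for a given T, using the 20 MHz column. The function name and dictionary keys are illustrative assumptions, not part of the data sheet.

```python
def derived_limits_20mhz(t_ns):
    """Evaluate the T-dependent limits of the 20 MHz column, in ns."""
    return {
        # Parameter 1A: SYSCLK at 1.5 V to SYSCLK at 1.5 V (as output),
        # specified as 0.5T-1 (min) and 0.5T+1 (max).
        "1A_min": 0.5 * t_ns - 1,
        "1A_max": 0.5 * t_ns + 1,
        # Parameter 11: asynchronous input minimum pulse width, T+10.
        "11_min": t_ns + 10,
        # Parameter 18: WARN deassertion hold minimum pulse width, 4T.
        "18_min": 4 * t_ns,
    }


# Example: the minimum SYSCLK period listed for 20 MHz operation is 50 ns.
limits = derived_limits_20mhz(50)
print(limits)
```

At T = 50 ns, for example, the asynchronous input minimum pulse width works out to 60 ns and the WARN minimum pulse width to 200 ns.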
Notes:

1. AC measurements made relative to 1.5 V, except where noted.

2. SYSCLK rise and fall times measured between 0.8 V and (Vcc - 1.0 V).

3. Synchronous Outputs relative to SYSCLK rising edge include: A31-A0, BGRT, R/W, SUP/US, LOCK, MPGM1-MPGM0,
IREQ, IREQT, PIA, DREQ, DREQT1-DREQT0, PDA, OPT2-OPT0, STAT2-STAT0, and MSERR.

4. Three-state Synchronous Outputs relative to SYSCLK rising edge include: A31-A0, R/W, SUP/US, LOCK,
MPGM1-MPGM0, IREQ, IREQT, PIA, DREQ, DREQT1-DREQT0, PDA, and OPT2-OPT0.

5. Synchronous Outputs relative to SYSCLK falling edge: IBREQ, DBREQ.

6. Synchronous Inputs include: BREQ, PEN, IRDY, IERR, IBACK, DERR, DBACK, CDA, I31-I0, DRDY, and D31-D0.

7. Synchronous Inputs include: BREQ, PEN, IRDY, IERR, IBACK, DERR, DBACK, and CDA.

8. Asynchronous Inputs include: WARN, INTR3-INTR0, TRAP1-TRAP0, and CNTL1-CNTL0.

9. RESET is an asynchronous input on assertion/deassertion. As an option to the user, RESET deassertion can be used to
force the state of the internal divide-by-two flip-flop to synchronize the phase of SYSCLK (if internally generated)
relative to RESET/INCLK.

10. WARN has a minimum pulse width requirement upon deassertion.

11. To guarantee Store/Load with one-cycle memories, D31-D0 must be asserted relative to SYSCLK falling edge from an
external drive source.

12. Refer to Capacitive Output Delay table when capacitive loads exceed 80 pF.

13. When used as an input, SYSCLK presents a 90-pF max. load to the external driver. When SYSCLK is used as an
output, timing is specified with an external load capacitance of < 200 pF.

14. Three-State Output Inactive Test Load. Three-State Synchronous Output Invalid Delay is measured as the time to a
±500 mV change from prior output level.

15. When a three-state output makes a synchronous transition from a valid logic level to a high-impedance state, data is
guaranteed to be held valid for an amount of time equal to the lesser of the minimum Three-State Synchronous Output
Invalid Delay and the minimum Synchronous Output Valid Delay.

Conditions:

a. All inputs/outputs are TTL compatible for VIH, VIL, VOH, and VOL unless otherwise noted.

b. All output timing specifications are for 80 pF of loading.

c. All setup, hold, and delay times are measured relative to SYSCLK or INCLK unless otherwise noted.

d. All input Low levels must be driven to 0.45 V and all input High levels must be driven to 2.4 V except SYSCLK.
SWITCHING WAVEFORMS

[Figure: Relative to SYSCLK. Waveforms for SYSCLK, the synchronous outputs, BINV, and the synchronous inputs, with timing measured at 1.5 V and SYSCLK levels marked at 0.8 V, 1.5 V, and Vcc - 1.0 V.]

[Figure: INCLK and Asynchronous Inputs. Waveforms for INCLK, RESET, WARN, and the asynchronous inputs.]

[Figure: SYSCLK Definition. SYSCLK levels marked at 0.8 V, 1.5 V, and Vcc - 1.0 V.]

[Figure: INCLK to SYSCLK Delay. Waveforms for SYSCLK and INCLK, measured at 1.5 V.]
Capacitive Output Delays

For loads greater than 80 pF

This table describes the additional output delays for capacitive loads greater than 80 pF. Values in the Maximum
Additional Delay column should be added to the value listed in the Switching Characteristics table. For loads less
than or equal to 80 pF, refer to the delays listed in the Switching Characteristics table.

No. | Parameter Description | Total External Capacitance | Maximum Additional Delay
6 | Synchronous SYSCLK Output Valid Delay | 100 pF | +1 ns
6 | | 150 pF | +2 ns
6 | | 200 pF | +4 ns
6 | | 250 pF | +6 ns
6 | | 300 pF | +8 ns
6A | Synchronous SYSCLK Output Valid Delay for D31-D0 | 100 pF | +1 ns
6A | | 150 pF | +6 ns
6A | | 200 pF | +10 ns
6A | | 250 pF | +15 ns
6A | | 300 pF | +19 ns
8 | Synchronous SYSCLK Output Valid Delay | 100 pF | +1 ns
8 | | 150 pF | +2 ns
8 | | 200 pF | +4 ns
8 | | 250 pF | +6 ns
8 | | 300 pF | +8 ns
19 | BINV Synchronous Output Valid Delay from SYSCLK | 100 pF | +1 ns
19 | | 150 pF | +3 ns
19 | | 200 pF | +4 ns
19 | | 250 pF | +6 ns
19 | | 300 pF | +7 ns
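
The adjustment described above amounts to a table lookup followed by an addition. The Python sketch below shows one way to apply it. The function name is hypothetical, and rounding an intermediate load up to the next listed capacitance is a conservative assumption made here; the data sheet lists delays only at 100, 150, 200, 250, and 300 pF.

```python
# Maximum additional delay in ns, keyed by parameter number and then by
# total external capacitance in pF (values from the table above).
ADDITIONAL_DELAY_NS = {
    "6": {100: 1, 150: 2, 200: 4, 250: 6, 300: 8},
    "6A": {100: 1, 150: 6, 200: 10, 250: 15, 300: 19},
    "8": {100: 1, 150: 2, 200: 4, 250: 6, 300: 8},
    "19": {100: 1, 150: 3, 200: 4, 250: 6, 300: 7},
}


def max_output_delay(param, base_max_ns, load_pf):
    """Return the maximum output delay in ns for a given capacitive load.

    base_max_ns is the Max. value from the Switching Characteristics table.
    """
    if load_pf <= 80:
        # Up to 80 pF the Switching Characteristics value applies unchanged.
        return base_max_ns
    table = ADDITIONAL_DELAY_NS[param]
    for cap in sorted(table):
        if load_pf <= cap:
            # Assumption: round the load up to the next listed capacitance.
            return base_max_ns + table[cap]
    raise ValueError("load exceeds the 300 pF covered by the table")


# Parameter 6A with a 20 ns base maximum and a 150 pF load.
print(max_output_delay("6A", 20, 150))
```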



SWITCHING TEST CIRCUIT

[Figure: Switching test circuit. The Am29000 pin under test is connected to load capacitance CL, with test currents of 3.2 mA and 400 µA.]

CL is guaranteed to 80 pF. For capacitive loading greater
than 80 pF, refer to the Capacitive Output Delay table.
Am29000 Thermal Characteristics
Pin-Grid-Array Package

θJA = θJC + θCA

Thermal Resistance - °C/Watt

Airflow, ft./min. (m/sec)

Parameter | 0 (0) | 150 (0.76) | 300 (1.53) | 480 (2.45) | 700 (3.58) | 900 (4.61)
θJC Junction-to-Case | [illegible] | [illegible] | [illegible] | 2 | 2 | 2
θCA Case-to-Ambient (no Heatsink) | [illegible] | [illegible] | 14 | 13 | 11 | 10
θCA Case-to-Ambient (with omnidirectional 4-Fin Heatsink, Thermalloy 0417261) | 10 | 6 | 3 | 2 | 2 | 2
θCA Case-to-Ambient (with unidirectional Pin Fin Heatsink, Wakefield 840-20) | 10 | 6 | 3 | 2 | 2 | 2
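
The relation θJA = θJC + θCA can be combined with the standard steady-state estimate Tj = Ta + P × θJA to approximate junction temperature. The Python sketch below does this with values read from the PGA table (θJC = 2 and θCA = 13 °C/W without a heatsink at 480 ft./min.). The ambient temperature and power dissipation are hypothetical inputs chosen only for illustration, and the function name is likewise an assumption.

```python
def junction_temperature(ambient_c, power_w, theta_jc, theta_ca):
    """Estimate junction temperature in °C from thermal resistances in °C/W."""
    theta_ja = theta_jc + theta_ca  # junction-to-ambient thermal resistance
    return ambient_c + power_w * theta_ja


# Hypothetical operating point: 50 °C ambient, 1.5 W dissipation,
# PGA package without heatsink at 480 ft./min. airflow.
print(junction_temperature(50.0, 1.5, 2, 13))
```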



Ceramic-Quad-Flat-Pack Package

Thermal Resistance - °C/Watt

Airflow, ft./min. (m/sec)

Parameter | 0 (0) | 150 (0.76) | 300 (1.53) | 480 (2.45) | 700 (3.58) | 900 (4.61)
θJC Junction-to-Case | - | - | - | - | - | -
θCA Case-to-Ambient | - | - | - | - | - | -

Note: This is for reference only.
Preliminary

Am29027

Arithmetic Accelerator

Advanced Micro Devices

DISTINCTIVE CHARACTERISTICS

■ High-speed floating-point accelerator for the
Am29000™ processor

■ Comprehensive floating-point and integer
instruction sets, including addition,
subtraction, and multiplication

■ Single-, double-, and mixed-precision
operations

■ Performs conversions between precisions and
between data formats

■ Complies with seven industry-standard
floating-point formats:

- IEEE Standard for Binary Floating-Point
Arithmetic (ANSI/IEEE Std 754-1985), single- and
double-precision

- DEC™ F, DEC D, and DEC G Standards

- IBM® System/370 single- and double-precision

■ Exact IEEE compliance for denormalized
numbers with no speed penalty

■ Simple interface requires no glue logic
between Am29000 and Am29027™

■ Eight-deep register file for intermediate results
and on-chip 64-bit data path facilitate
compound operations, for example,
Newton-Raphson division, sum-of-products, and
transcendentals

■ Supports pipelined or flow-through operation

■ Full compiler and assembler support for IEEE
format

■ Fabricated with Advanced Micro Devices'
1.2-micron CMOS process



SIMPLIFIED SYSTEM DIAGRAM

[Figure: Simplified system diagram. The Am29027 Arithmetic Accelerator and the Am29000 Streamlined Instruction Processor share the 32-bit Address, Data, and Instruction buses, together with Instruction ROM, Instruction Memory, and Data Memory.]
GENERAL DESCRIPTION

The Am29027 Arithmetic Accelerator is a high-performance
computational unit intended for use with
the Am29000 Streamlined Instruction Processor. When
added to an Am29000-based system, the Am29027 
improves floating-point performance by an order of 
magnitude or more. 

The Am29027 implements an extensive floating-point 
and integer instruction set, and can perform operations 
on single-, double-, or mixed-precision operands. The 
three most widely used floating-point formats— IEEE, 
DEC, and IBM— are supported. IEEE operations fully 
comply with the IEEE Standard for Binary Floating-Point 
Arithmetic (ANSI/IEEE standard 754-1985), with direct 
implementation of special features such as gradual un- 
derflow and exception handling. 

The Am29027 consists of a 64-bit ALU, a 64-bit data 
path, and a control unit. The ALU has three data input 
ports, and can perform operations requiring one, two, or 
three input operands. The data path comprises two 
64-bit input operand registers, an 8-by-64-bit register 
file for storage of intermediate results, three operand
selection multiplexers that provide for orthogonal selection
of input operands, and an output multiplexer that 
allows access to the result data, the operation status, 
the flags, or the accelerator state. The control unit inter- 
prets transaction requests from the Am29000, and 
sequences the ALU and data path. 

Operations can be performed in either of two modes: 
flow-through or pipeline. In flow-through mode, the ALU 
is completely combinatorial; this mode is best suited 
to scalar operations. Pipeline mode divides the ALU 
into two or three pipelined stages for use in vector
operations, such as those found in graphics or signal
processing.

The Am29027 connects directly to Am29000 system 
buses and requires no additional interface circuitry. 

Fabricated with AMD's 1.2-micron CMOS technology, 
the Am29027 is housed in two packages: a 169- 
lead pin-grid-array (PGA) package, and a 164-lead 
ceramic-quad-flat-pack (CQFP) package for military 
applications. 

Related AMD Products

Part No. | Description
Am29000 | Streamlined Instruction Processor

29K™ Family Development Support Products

Contact your local AMD representative for information 
on the complete set of development support tools. 

Software development products on several hosts: 

■ Optimizing compilers for common high-level 
languages 

■ Assembler and utility packages 

■ Source- and assembly-level software debuggers 

■ Target-resident development monitors 

■ Simulators

Hardware Development:

■ ADAPT29K™ Advanced Development and Prototyping Tool
CONNECTION DIAGRAMS

169-Lead PGA*
Bottom View

[Figure: 169-lead PGA pinout. Columns are lettered A B C D E F G H J K L M N P R T U; rows are numbered starting from 1.]

Pinout observed from pin side of package.
*Alignment pin (not connected internally).

CONNECTION DIAGRAMS (continued)

164-Lead CQFP*
Top View
(Lid Facing Viewer)

[Figure: 164-lead CQFP pinout, with pins numbered 1 through 164.]
PGA PIN DESIGNATIONS (sorted by Pin No.)

Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name
A-1 | S31 | C-10 | F20 | J-16 | I16 | R-12 | DREQT0
A-2 | F4 | C-11 | Vcco | J-17 | I18 | R-13 | RESET
A-3 | F6 | C-12 | GNDO | K-1 | S9 | R-14 | DREQ
A-4 | F8 | C-13 | F29 | K-2 | S10 | R-15 | I29
A-5 | F10 | C-14 | GNDO | K-3 | GND | R-16 | I27
A-6 | F12 | C-15 | Vcco | K-15 | I21 | R-17 | I24
A-7 | F14 | C-16 | I2 | K-16 | I20 | T-1 | R28
A-8 | F16 | C-17 | I6 | K-17 | I19 | T-2 | R23
A-9 | F18 | D-1 | S24 | L-1 | S8 | T-3 | R21
A-10 | F21 | D-2 | S25 | L-2 | S7 | T-4 | R18
A-11 | F22 | D-3 | S29 | L-3 | S6 | T-5 | R16
A-12 | F24 | D-4 | (see note) | L-15 | GNDO | T-6 | R13
A-13 | F27 | D-15 | I0 | L-16 | I23 | T-7 | R10
A-14 | F28 | D-16 | I3 | L-17 | I22 | T-8 | R7
A-15 | F31 | D-17 | I8 | M-1 | S5 | T-9 | R5
A-16 | SLAVE | E-1 | S21 | M-2 | S4 | T-10 | R3
A-17 | I1 | E-2 | S23 | M-3 | S2 | T-11 | R0
B-1 | S30 | E-3 | S26 | M-15 | Vcco | T-12 | OPT1
B-2 | F1 | E-15 | I4 | M-16 | DRDY | T-13 | DREQT1
B-3 | F3 | E-16 | I7 | M-17 | CDA | T-14 | BINV
B-4 | F5 | E-17 | I9 | N-1 | S3 | T-15 | I31
B-5 | F7 | F-1 | S18 | N-2 | S1 | T-16 | I28
B-6 | F9 | F-2 | S20 | N-3 | R30 | T-17 | I25
B-7 | F13 | F-3 | S22 | N-15 | NC | U-1 | R25
B-8 | F15 | F-15 | Vcc | N-16 | EXCP | U-2 | R22
B-9 | F17 | F-16 | I10 | N-17 | DERR | U-3 | R19
B-10 | F19 | F-17 | I12 | P-1 | S0 | U-4 | R17
B-11 | F23 | G-1 | S15 | P-2 | R29 | U-5 | R15
B-12 | F25 | G-2 | S17 | P-3 | R26 | U-6 | R14
B-13 | F26 | G-3 | S19 | P-15 | I26 | U-7 | R11
B-14 | F30 | G-15 | GND | P-16 | NC | U-8 | R9
B-15 | GND | G-16 | I11 | P-17 | NC | U-9 | R6
B-16 | MSERR | G-17 | I14 | R-1 | R31 | U-10 | R4
B-17 | I5 | H-1 | S13 | R-2 | R27 | U-11 | R2
C-1 | S27 | H-2 | S14 | R-3 | R24 | U-12 | R1
C-2 | S28 | H-3 | S16 | R-4 | R20 | U-13 | OPT0
C-3 | F0 | H-15 | GND | R-5 | Vcc | U-14 | OPT2
C-4 | F2 | H-16 | I13 | R-6 | GND | U-15 | R/W
C-5 | Vcco | H-17 | I15 | R-7 | R12 | U-16 | OE
C-6 | GNDO | J-1 | S11 | R-8 | R8 | U-17 | I30
C-7 | F11 | J-2 | S12 | R-9 | GND | |
C-8 | GNDO | J-3 | Vcc | R-10 | Vcc | |
C-9 | Vcco | J-15 | I17 | R-11 | CLK | |

Note: Pin Number D-4 = Alignment Pin.

Vcco and GNDO are power and ground pins for the output buffers.
Vcc and GND are power and ground pins for the rest of the logic.
PGA PIN DESIGNATIONS (sorted by Pin Name)

Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name
T-14 | BINV | G-15 | GND | B-16 | MSERR | P-1 | S0
M-17 | CDA | H-15 | GND | N-15 | NC | N-2 | S1
R-11 | CLK | K-3 | GND | P-16 | NC | M-3 | S2
N-17 | DERR | R-6 | GND | P-17 | NC | N-1 | S3
M-16 | DRDY | R-9 | GND | U-16 | OE | M-2 | S4
R-14 | DREQ | C-6 | GNDO | U-13 | OPT0 | M-1 | S5
R-12 | DREQT0 | C-8 | GNDO | T-12 | OPT1 | L-3 | S6
T-13 | DREQT1 | C-12 | GNDO | U-14 | OPT2 | L-2 | S7
N-16 | EXCP | C-14 | GNDO | T-11 | R0 | L-1 | S8
C-3 | F0 | L-15 | GNDO | U-12 | R1 | K-1 | S9
B-2 | F1 | D-15 | I0 | U-11 | R2 | K-2 | S10
C-4 | F2 | A-17 | I1 | T-10 | R3 | J-1 | S11
B-3 | F3 | C-16 | I2 | U-10 | R4 | J-2 | S12
A-2 | F4 | D-16 | I3 | T-9 | R5 | H-1 | S13
B-4 | F5 | E-15 | I4 | U-9 | R6 | H-2 | S14
A-3 | F6 | B-17 | I5 | T-8 | R7 | G-1 | S15
B-5 | F7 | C-17 | I6 | R-8 | R8 | H-3 | S16
A-4 | F8 | E-16 | I7 | U-8 | R9 | G-2 | S17
B-6 | F9 | D-17 | I8 | T-7 | R10 | F-1 | S18
A-5 | F10 | E-17 | I9 | U-7 | R11 | G-3 | S19
C-7 | F11 | F-16 | I10 | R-7 | R12 | F-2 | S20
A-6 | F12 | G-16 | I11 | T-6 | R13 | E-1 | S21
B-7 | F13 | F-17 | I12 | U-6 | R14 | F-3 | S22
A-7 | F14 | H-16 | I13 | U-5 | R15 | E-2 | S23
B-8 | F15 | G-17 | I14 | T-5 | R16 | D-1 | S24
A-8 | F16 | H-17 | I15 | U-4 | R17 | D-2 | S25
B-9 | F17 | J-16 | I16 | T-4 | R18 | E-3 | S26
A-9 | F18 | J-15 | I17 | U-3 | R19 | C-1 | S27
B-10 | F19 | J-17 | I18 | R-4 | R20 | C-2 | S28
C-10 | F20 | K-17 | I19 | T-3 | R21 | D-3 | S29
A-10 | F21 | K-16 | I20 | U-2 | R22 | B-1 | S30
A-11 | F22 | K-15 | I21 | T-2 | R23 | A-1 | S31
B-11 | F23 | L-17 | I22 | R-3 | R24 | A-16 | SLAVE
A-12 | F24 | L-16 | I23 | U-1 | R25 | F-15 | Vcc
B-12 | F25 | R-17 | I24 | P-3 | R26 | J-3 | Vcc
B-13 | F26 | T-17 | I25 | R-2 | R27 | R-5 | Vcc
A-13 | F27 | P-15 | I26 | T-1 | R28 | R-10 | Vcc
A-14 | F28 | R-16 | I27 | P-2 | R29 | C-5 | Vcco
C-13 | F29 | T-16 | I28 | N-3 | R30 | C-9 | Vcco
B-14 | F30 | R-15 | I29 | R-1 | R31 | C-11 | Vcco
A-15 | F31 | U-17 | I30 | R-13 | RESET | C-15 | Vcco
B-15 | GND | T-15 | I31 | U-15 | R/W | M-15 | Vcco

Note: Pin Number D-4 = Alignment Pin.

Vcco and GNDO are power and ground pins for the output buffers.
Vcc and GND are power and ground pins for the rest of the logic.
CQFP PIN DESIGNATIONS (sorted by Pin No.)

Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name
1 | F0 | 42 | Vcc | 83 | I29 | 124 | R25
2 | F1 | 43 | GND | 84 | I28 | 125 | R26
3 | F2 | 44 | I0 | 85 | I31 | 126 | R27
4 | F3 | 45 | I1 | 86 | DREQ | 127 | R28
5 | F4 | 46 | I2 | 87 | OE | 128 | R29
6 | Vcco | 47 | I3 | 88 | BINV | 129 | R30
7 | GNDO | 48 | I4 | 89 | RESET | 130 | R31
8 | F5 | 49 | I5 | 90 | R/W | 131 | S0
9 | F6 | 50 | I6 | 91 | DREQT1 | 132 | S1
10 | F7 | 51 | I7 | 92 | DREQT0 | 133 | S2
11 | F8 | 52 | I8 | 93 | OPT2 | 134 | S3
12 | F9 | 53 | I9 | 94 | OPT1 | 135 | S4
13 | F10 | 54 | I10 | 95 | OPT0 | 136 | S5
14 | F11 | 55 | I11 | 96 | CLK | 137 | S6
15 | F12 | 56 | I12 | 97 | R0 | 138 | S7
16 | F13 | 57 | I13 | 98 | R1 | 139 | S8
17 | F14 | 58 | GND | 99 | R2 | 140 | S9
18 | F15 | 59 | I14 | 100 | R3 | 141 | S10
19 | GNDO | 60 | I15 | 101 | R4 | 142 | S11
20 | Vcco | 61 | I16 | 102 | Vcc | 143 | GND
21 | F16 | 62 | I17 | 103 | GND | 144 | Vcc
22 | F17 | 63 | I18 | 104 | R5 | 145 | S12
23 | F18 | 64 | I19 | 105 | R6 | 146 | S13
24 | F19 | 65 | I20 | 106 | R7 | 147 | S14
25 | F20 | 66 | I21 | 107 | R8 | 148 | S15
26 | F21 | 67 | I22 | 108 | R9 | 149 | S16
27 | F22 | 68 | I23 | 109 | R10 | 150 | S17
28 | F23 | 69 | CDA | 110 | R11 | 151 | S18
29 | F24 | 70 | DRDY | 111 | R12 | 152 | S19
30 | F25 | 71 | DERR | 112 | R13 | 153 | S20
31 | F26 | 72 | GNDO | 113 | R14 | 154 | S21
32 | Vcco | 73 | Vcco | 114 | R15 | 155 | S22
33 | GNDO | 74 | EXCP | 115 | R16 | 156 | S23
34 | F27 | 75 | NC | 116 | R17 | 157 | S24
35 | F28 | 76 | NC | 117 | R18 | 158 | S25
36 | F29 | 77 | NC | 118 | R19 | 159 | S26
37 | F30 | 78 | I24 | 119 | R20 | 160 | S27
38 | F31 | 79 | I25 | 120 | R21 | 161 | S28
39 | GND | 80 | I26 | 121 | R22 | 162 | S29
40 | SLAVE | 81 | I27 | 122 | R23 | 163 | S30
41 | MSERR | 82 | I30 | 123 | R24 | 164 | S31
CQFP PIN DESIGNATIONS (sorted by Pin Name)

Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name | Pin No. | Pin Name
88 | BINV | 39 | GND | 41 | MSERR | 130 | R31
69 | CDA | 43 | GND | 75 | NC | 40 | SLAVE
96 | CLK | 58 | GND | 76 | NC | 131 | S0
71 | DERR | 103 | GND | 77 | NC | 132 | S1
86 | DREQ | 143 | GND | 87 | OE | 133 | S2
92 | DREQT0 | 7 | GNDO | 95 | OPT0 | 134 | S3
91 | DREQT1 | 19 | GNDO | 94 | OPT1 | 135 | S4
70 | DRDY | 33 | GNDO | 93 | OPT2 | 136 | S5
74 | EXCP | 72 | GNDO | 89 | RESET | 137 | S6
1 | F0 | 44 | I0 | 90 | R/W | 138 | S7
2 | F1 | 45 | I1 | 97 | R0 | 139 | S8
3 | F2 | 46 | I2 | 98 | R1 | 140 | S9
4 | F3 | 47 | I3 | 99 | R2 | 141 | S10
5 | F4 | 48 | I4 | 100 | R3 | 142 | S11
8 | F5 | 49 | I5 | 101 | R4 | 145 | S12
9 | F6 | 50 | I6 | 104 | R5 | 146 | S13
10 | F7 | 51 | I7 | 105 | R6 | 147 | S14
11 | F8 | 52 | I8 | 106 | R7 | 148 | S15
12 | F9 | 53 | I9 | 107 | R8 | 149 | S16
13 | F10 | 54 | I10 | 108 | R9 | 150 | S17
14 | F11 | 55 | I11 | 109 | R10 | 151 | S18
15 | F12 | 56 | I12 | 110 | R11 | 152 | S19
16 | F13 | 57 | I13 | 111 | R12 | 153 | S20
17 | F14 | 59 | I14 | 112 | R13 | 154 | S21
18 | F15 | 60 | I15 | 113 | R14 | 155 | S22
21 | F16 | 61 | I16 | 114 | R15 | 156 | S23
22 | F17 | 62 | I17 | 115 | R16 | 157 | S24
23 | F18 | 63 | I18 | 116 | R17 | 158 | S25
24 | F19 | 64 | I19 | 117 | R18 | 159 | S26
25 | F20 | 65 | I20 | 118 | R19 | 160 | S27
26 | F21 | 66 | I21 | 119 | R20 | 161 | S28
27 | F22 | 67 | I22 | 120 | R21 | 162 | S29
28 | F23 | 68 | I23 | 121 | R22 | 163 | S30
29 | F24 | 78 | I24 | 122 | R23 | 164 | S31
30 | F25 | 79 | I25 | 123 | R24 | 42 | Vcc
31 | F26 | 80 | I26 | 124 | R25 | 102 | Vcc
34 | F27 | 81 | I27 | 125 | R26 | 144 | Vcc
35 | F28 | 84 | I28 | 126 | R27 | 6 | Vcco
36 | F29 | 83 | I29 | 127 | R28 | 20 | Vcco
37 | F30 | 82 | I30 | 128 | R29 | 32 | Vcco
38 | F31 | 85 | I31 | 129 | R30 | 73 | Vcco
[Figure: Am29027 logic symbol. Transaction Request inputs: RESET, R/W, DREQ, DREQT1-DREQT0, OPT2-OPT0, BINV, R31-R0, S31-S0, I31-I0. Other inputs: OE, SLAVE, CLK. Transaction Status outputs: CDA, DRDY, DERR, F31-F0, MSERR, EXCP.]
ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The ordering number
(Valid Combination) is formed by a combination of:

a. Device Number

b. Speed Option (if applicable)

c. Package Type

d. Temperature Range

e. Optional Processing

AM29027 -25 G C B

a. DEVICE NUMBER/DESCRIPTION

Am29027
Arithmetic Accelerator

b. SPEED OPTION

-25 = 25 MHz
-20 = 20 MHz
-16 = 16 MHz

c. PACKAGE TYPE

G = 169-Lead Pin Grid Array without Heatsink
(CGX169)

d. TEMPERATURE RANGE

C = Commercial (0 to +85°C)

e. OPTIONAL PROCESSING

Blank = Standard Processing
B = Burn-in

Valid Combinations

AM29027-25, AM29027-20, AM29027-16 | GC, GCB

Valid Combinations

Valid Combinations list configurations planned to
be supported in volume for this device. Consult
the local AMD sales office to confirm availability of
specific valid combinations, to check on newly
released combinations, and to obtain additional
data on AMD's standard military grade products.
MILITARY ORDERING INFORMATION
APL Products

AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL
(Approved Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid
Combination) is formed by a combination of:

a. Device Number

b. Speed Option (if applicable)

c. Device Class

d. Package Type

e. Lead Finish

AM29027 -20 /B Z C

a. DEVICE NUMBER/DESCRIPTION

Am29027
Arithmetic Accelerator

b. SPEED OPTION

-20 = 20 MHz
-16 = 16 MHz

c. DEVICE CLASS

/B = Class B

d. PACKAGE TYPE

Z = 169-Lead Pin Grid Array without Heatsink (CGX169)
Y = 164-Lead Ceramic Quad Flat Pack without Heatsink

e. LEAD FINISH

C = Gold

Valid Combinations

AM29027-20, AM29027-16 | /BZC, /BYC

Valid Combinations

Valid Combinations list configurations planned to
be supported in volume for this device. Consult
the local AMD sales office to confirm availability of
specific valid combinations or to check on newly
released valid combinations.

Group A Tests

Group A tests consist of Subgroups
1, 2, 3, 7, 8, 9, 10, 11.
PIN DESCRIPTION

BINV

Bus Invalid (Synchronous Input)

A logic Low indicates that the Am29000 address bus
and related control signals are invalid. The Am29027
will ignore signal DREQT1 when BINV is Low.



CDA

Coprocessor Data Accept (Three-State Output)

A logic Low indicates that the Am29027 is ready to
accept data from the Am29000. This signal is normally
driven by the Am29027, and assumes a high-impedance
state only if input signal OE is High or input signal
SLAVE is Low.

CLK 

Clock (Input) 



DERR

Data Error (Three-State Output)

A logic Low indicates that an unmasked exception
occurred during or preceding the current transaction
request. This signal is normally driven by the Am29027,
and assumes a high-impedance state only if input signal
OE is High or input signal SLAVE is Low.



DRDY

Data Ready (Three-State Output)

A logic Low indicates that data is available on Port F.
This signal is normally driven by the Am29027, and
assumes a high-impedance state only if input signal OE is
High or input signal SLAVE is Low.



DREQ

Data Request (Synchronous Input)

A logic Low indicates that the Am29000 is making a data
access. The Am29027 will ignore signal DREQT1 when
DREQ is High.

DREQT0

Start Instruction/Suppress Errors
(Synchronous Input)

This signal, when accompanied by a valid write operand
R, write operand S, write operands R, S, or write instruction
transaction request, commands the Am29027 to
begin a new operation. When accompanying a valid
read result LSBs, read result MSBs, read flags, or read
status transaction request, DREQT0 suppresses the
reporting of operation errors. DREQT0 also modifies the
action of the write status transaction request to retime
an operation in flow-through mode, or to invalidate the
ALU pipeline in pipeline mode.

DREQT1

Accelerator Transaction Request
(Synchronous Input)

A logic High indicates that the Am29000 is making an
accelerator transaction request. This signal is considered
valid only when signal BINV is High and signal
DREQ is Low.
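
The qualification of DREQT1 by BINV and DREQ can be expressed as a single Boolean condition. The Python sketch below models the logic levels as booleans (True for High, False for Low); the function name is an illustrative assumption, and the sketch describes only the levels stated in the pin descriptions, not the device's timing.

```python
def accelerator_request_valid(binv, dreq, dreqt1):
    """Return True when a valid accelerator transaction request is present.

    Arguments are logic levels: True = High, False = Low.
    DREQT1 is considered only when BINV is High (bus valid) and
    DREQ is Low (data access in progress); a High DREQT1 then
    signals an accelerator transaction request.
    """
    return binv and (not dreq) and dreqt1


# BINV High, DREQ Low, DREQT1 High: a valid accelerator request.
print(accelerator_request_valid(True, False, True))
# BINV Low: DREQT1 is ignored.
print(accelerator_request_valid(False, False, True))
```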



EXCP

Exception (Three-State Output)

Indicates that the status register contains one or more
unmasked exception bits. This signal can be used as
an interrupt or trap signal by the Am29000. EXCP is
normally driven by the Am29027, and assumes a
high-impedance state only if input signal OE is High or input
signal SLAVE is Low.

F31-F0

F Output Bus (Three-State Output)

I31-I0

Instruction Bus (Synchronous Input)

Used to specify the operation to be performed by the
accelerator.

MSERR 

Master/Slave Error (Output) 

Reports the result of the comparison of processor out- 
puts with the signals provided internally to the off-chip 
drivers. If there is a difference for any enabled driver, 
MSERR assumes the logic High state. 

OE 

Output Enable (Asynchronous Input) 

A logic High forces all accelerator outputs except 
MSERR to assume a high-impedance state uncondi- 
tionally; master/slave comparison circuitry is also dis- 
abled. This signal is provided for test purposes. 

OPT2-OPT0 

Transaction Type (Synchronous Input) 

These signals, in conjunction with R/W, specify the type 
of accelerator transaction, if any, currently being re- 
quested by the Am29000. 

R31-R0

R Data Bus (Synchronous Input)

RESET

Reset (Asynchronous Input)

Resets the Am29027. When RESET is a logic Low, the
state of internal sequencing circuitry is initialized, and
the status register is cleared. RESET must be connected
to the signal line used to reset the Am29000.

R/W

Read/Write (Synchronous Input)

Determines the direction of a transaction. When R/W is
High, data is transferred from the Am29027 to the
Am29000; when R/W is Low, data is transferred from the
Am29000 to the Am29027.
S31-S0

S Data Bus (Synchronous Input)

SLAVE

Master/Slave Mode Select
(Synchronous Input)

A logic Low selects Slave mode; in this mode all outputs
except MSERR assume a high-impedance state. A logic
High selects Master mode.
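
The pin descriptions give the same condition for each three-state status output: it is driven unless OE is High or SLAVE is Low, while MSERR is exempt from both. The Python sketch below captures that rule for the status outputs named in this section. The function and constant names are hypothetical, and logic levels are modelled as booleans (True for High).

```python
# Three-state status outputs whose descriptions state the OE/SLAVE rule.
THREE_STATE_OUTPUTS = {"CDA", "DERR", "DRDY", "EXCP"}


def output_state(signal, oe, slave):
    """Return "driven" or "high-Z" for a status output.

    oe and slave are logic levels: True = High, False = Low.
    """
    if signal == "MSERR":
        # MSERR is excluded from both the OE and the SLAVE override.
        return "driven"
    if signal not in THREE_STATE_OUTPUTS:
        raise ValueError("signal not covered by this sketch: " + signal)
    if oe or not slave:
        return "high-Z"
    return "driven"


# Master mode (SLAVE High) with outputs enabled (OE Low).
print(output_state("DRDY", oe=False, slave=True))
# Slave mode (SLAVE Low): status outputs float, MSERR stays driven.
print(output_state("DRDY", oe=False, slave=False))
print(output_state("MSERR", oe=False, slave=False))
```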

FUNCTIONAL DESCRIPTION 
Overview 

The Am29027 is a high-performance, single-chip arithmetic
accelerator for the Am29000 Streamlined Instruction
Processor.

Architecture 

The Am29027 comprises a high-speed ALU, a 64-bit 
data path, and control circuitry. 

The core of the Am29027 is a 64-bit floating-point/inte- 
ger ALU. The ALU takes operands from three 64-bit 
input ports and performs the selected operation, placing 
the result on a 64-bit output port. Seven ALU flags report 
operation status. The ALU is completely combinatorial 
for minimum latency; optional pipelining is available to 
boost throughput for vector operations. 

The data path consists of two 32-bit input buses, R and 
S; two 64-bit input registers; two 64-bit temporary input 
registers; a 64-bit result register; an 8-word-by-64-bit 
register file for storage of intermediate results; three op- 
erand selection multiplexers that provide for orthogonal 
selection of input operands; an output multiplexer that 
selects data, operation flags, operation status, or other 
accelerator state; and a 32-bit output bus, F. Input oper- 
ands enter the floating-point accelerator through the R 
and S buses, and are then demultiplexed and buffered 
for subsequent storage in the input registers. The oper- 
and selection multiplexers route the operands to the 
ALU; operation results and status leave the device on 
Output Bus F. Operation results also can be stored in 
the register file for use in subsequent operations. 

On-board control circuitry sequences the ALU and data 
path during operations, and manages the transfer of 
data between the accelerator and the Am29000. A 
32-bit instruction register and a 32-bit temporary instruction
register hold the instruction words for current
and pending operations.

Instruction Set 

The Am29027 implements 57 arithmetic and logical in- 
structions. Thirty-five instructions operate on floating- 
point numbers; these instructions fall into the following 
categories: 

■ addition/subtraction 

■ multiplication 



■ multiplication-accumulation 

■ comparison 

■ selecting the maximum or minimum of two numbers 

■ rounding to integral value 

■ absolute value, negation, pass 

■ reciprocal seed generation 

■ conversion between any of the supported 
floating-point formats, including conversions 
between precisions 

■ conversion of a floating-point number to an integer 
format, with an optional scale factor 

By concatenating these operations, the user can also 
perform division, square-root extraction, polynomial 
evaluation, and other functions not implemented 
directly. 

Twenty-two instructions operate on integers, and belong
to the following general categories:

■ addition/subtraction 

■ multiplication 

■ comparison 

■ selecting the maximum or minimum of two numbers 

■ absolute value, negation, pass 

■ logical operations, e.g., AND, OR, XOR, NOT 

■ arithmetic, logical, and funnel shifts 

■ conversion between single- and double-precision 
integer formats 

■ conversion of an integer number to a floating-point 
format, with an optional scale factor 

■ pass operand 

One special instruction is provided to move data. 

Performance 

The Am29027 provides operation speeds several times 
greater than conventional floating-point processors 
by virtue of its extensive use of combinatorial, rather 
than sequential, logic. Most floating-point operations, 
whether single, double, or mixed precision, can be 
performed in as few as six system clock cycles. Performance
is further enhanced by the on-board register file, which
can hold intermediate results and so reduce the time needed to
transfer operands between the Am29027 and the
Am29000. The input operand registers and the instruction
register are double-buffered, so that a new operation
can be specified while the current operation is being
completed.

Interface 

The Am29027 connects directly to the Am29000 system 
buses. Am29027 operations are specified by a series of 
operand and instruction transactions issued by the
Am29000. Eight input signals specify the transaction to
be performed; three output signals report transaction
status.

Master/Slave 

The Am29027 contains special comparison hardware to 
allow the operation of two accelerators in parallel, with 
one accelerator (the slave) checking the results pro- 
duced by the other (the master). This feature is of 
particular importance in the design of high-reliability 
systems. 

Support 

The Am29027 IEEE format is fully supported by those 
hardware and software tools available for the Am29000, 
including: 

■ HighC29K Cross-Development Toolkit 



■ ASM29K Cross-Development Toolkit

■ ADAPT29K, a general-purpose hardware develop- 
ment system. The ADAPT29K permits single-step 
operation, break-point insertion, and other standard 
debugging techniques. 

Block Diagram Description 

A block diagram of the Am29027 is shown in Figure 1.
The Am29027 comprises the input registers, the oper- 
and selection multiplexers, the instruction register, the 
ALU, the output register/register file, the flag register, 
the status register, the output multiplexer, the mode reg- 
ister, the control unit, and the master/slave comparator. 




Figure 1. Am29027 Block Diagram

Input Registers

Operands are loaded into the accelerator via the 32-bit
R and S buses, and are demultiplexed and buffered for
subsequent storage in 64-bit registers R and S; input operands
may be either single-precision (32-bit) or double-precision
(64-bit). Two single-precision operands or one double-precision
operand may be written to the input registers
in a single system clock cycle. Accompanying the input
registers are two 64-bit temporary registers, R-Temp 
and S-Temp, that permit the overlapping of operand 
transfers and ALU operations. 

Operand Selection Multiplexers 

The operand selection multiplexers route operands 
to the ALU. These multiplexers, as well as selecting 
operands from input registers R and S and register file 
locations RF7-RF0, also have access to a set of floating- 
point and integer constants. These constants are 
double-precision preprogrammed numbers for use in 
ALU operations, and are automatically provided in the 
appropriate format. 

Instruction Register 

The instruction register stores a 32-bit word specifying 
the current accelerator operation. Included in the in- 
struction word are fields that specify the core operation 
to be performed by the ALU, operand format (integer or 
floating-point), sign-change selects for ALU input and 
result operands, operand precisions, operand sources, 
and register file controls. The instruction register is
preceded by the 32-bit temporary register, I-Temp, permitting
the overlapping of instruction transfers and ALU
operations. Instructions enter the accelerator via 32-bit
Instruction Bus I.

ALU 

The ALU is a combinatorial arithmetic/logic unit that 
performs a large repertoire of floating-point and integer 
operations. The ALU has three operand inputs. Some 
operations require a single input operand, for example, 
conversion operations. Others, such as addition or mul- 
tiplication, require two input operands. The multiplica- 
tion-accumulation and funnel shift operations require 
three input operands. Most ALU operations allow the 
user to modify operand signs, thus greatly increasing 
the number of arithmetic expressions that can be evalu- 
ated in a single ALU pass. 

The ALU can be configured in either flow-through mode, 
for which the ALU is completely combinatorial, or pipe- 
line mode, for which ALU operations are divided into one 
or two pipeline stages. 

Output Register/Register File 

Operation results are stored in 64-bit output register F; 
results can also be stored in the 8-by-64-bit register 
file for use in subsequent operations. A precision regis- 
ter, part of the register file, contains bits indicating the 
precisions of the operands stored in each register file
location, thus permitting the ALU to correctly process 
these operands in later operations. 

Flag Register

The 32-bit flag register stores flags pertaining to the 
most recently performed operation. The flags indicate
error conditions, such as underflow or overflow, and 
also report results for operations that produce result 
flags, such as comparisons. 

Status Register 

The 32-bit status register contains information regard- 
ing the status of past, current, and pending operations. 

Six exception bits report operation error conditions. 
These exception bits are individually latched; once a 
given bit is set, it remains set until reset by the Am29000 
or by system reset. The exception bits indicate error 
conditions of overflow, underflow, zero result, reserved 
operand, invalid operation, and inexact result. At the us- 
er's option, the presence of an exception can be used to 
report a data error to the Am29000, or to halt Am29027 
operation; exception bits can be individually enabled or 
disabled by programming the corresponding mask bit in 
the mode register. 

Exception bit activity is summarized by a seventh bit,
Exception Status, which indicates that one or more unmasked
status bits are set. If desired, the state of this bit
can be placed on signal EXCP, which can be used to interrupt
the Am29000.

The status register contains four additional bits (R-Temp
Valid, S-Temp Valid, I-Temp Valid, and Operation
Pending) that pertain to the state of pending operands
and operations.

Output Multiplexer 

The output multiplexer routes operation results and the
accelerator's internal state to the Am29000 through the
32-bit F bus. This multiplexer can select Register F, the
flag register, status register, instruction register, mode 
register, or precision register. 

Mode Register 

The 64-bit mode register contains accelerator control
parameters that change infrequently or not at all, such 
as floating-point format, round mode, and operation 
timing information. These parameters are initialized by 
the Am29000 during system start-up, and are modified 
as required during operation. 

Control Unit 

The control unit manages the transfer of data between 
the Am29000 and the Am29027, as well as the timing of 
operation execution. The Am29000 oversees operation 
of the Am29027 by issuing one of thirteen commands, or 
transaction requests, to the control unit via eight signal 
lines. Each transaction request specifies an action on 
the part of the Am29027, such as writing an operand to 
an input register or returning a result to the Am29000. 
The control unit interprets the transaction request and 
sequences the Am29027 to produce the desired action. 
Three transaction status lines are generated by the control
unit to indicate transaction completion, or to indicate
the existence of an accelerator error condition.

Master/Slave Comparator

Each Am29027 output signal has associated logic that 
compares that signal with the signal that the accelerator 
provides internally to the output driver; any discrepan- 
cies are indicated by assertion of signal MSERR. 

For a single accelerator, this output comparison detects
short circuits in output signals or defective output drivers,
but does not detect open circuits. It is possible to
connect a second accelerator in parallel with the first,
with the second accelerator's outputs disabled by assertion
of signal SLAVE. The second accelerator detects
open-circuit signals, and provides a check of the outputs 
of the first accelerator. 

System Interface 

Am29000/Am29027 signal interconnects are depicted 
in Figure 2. 

Three Am29027 buses (R31-R0, I31-I0, and F31-F0) are
connected to Am29000 Data Bus D31-D0; the remaining
Am29027 bus, S31-S0, is connected to Am29000 Address
Bus A31-A0. Through these connections, the
Am29000 can transfer to the Am29027 a 32-bit instruction,
two 32-bit operands, or a 64-bit operand in a single
cycle, or can receive a 32-bit result from the Am29027 in
a single cycle.

Twelve additional signals govern communication between
the Am29000 and Am29027. Eight Am29000 output
signals (R/W, DREQ, DREQT1, DREQT0, OPT2-OPT0,
and BINV) are connected to the corresponding
Am29027 signals and are used to issue transaction
requests to the Am29027. Three Am29027 signals
(CDA, DRDY, and DERR) report transaction
status. CDA is directly connected to the corresponding
input of the Am29000, while DRDY and DERR must be
ORed with like signals from other resources. A fourth
Am29027 signal, EXCP, may at the user's option be
connected to an Am29000 trap or interrupt input to signal
the presence of Am29027 operation exceptions.

The Am29027 takes its clock input from the Am29000 
SYSCLK system clock output. 

The signal used to reset the Am29000 must also be
connected to the Am29027 RESET input.

Figure 2. Am29000/Am29027 Hardware Interface

Special-Purpose Registers

The Am29027 contains six special-purpose registers:
the mode register, status register, flag register, precision
register, instruction register, and I-Temp register.

Mode Register 

The 64-bit mode register stores 24 infrequently changed
parameters pertaining to accelerator operation; its format
is shown in Figure 3. The Am29000 modifies the
accelerator parameter set by issuing a write mode register
transaction request.

The mode register should be initialized after hardware 
reset, and may be written with new parameters when a 
new mode of accelerator operation is required; mode
changes take effect immediately. The Am29027 does 
not alter the contents of the mode register in the course 
of operation. 

Bits 63-47— Reserved for future use. This field must
be set to 0 to assure future compatibility.



Bit 46— EXCP Enable (EX): When EX is High, reporting
of unmasked exceptions via signal EXCP is enabled.
When EX is Low, signal EXCP is forced inactive (logic
High).

Bit 45— Halt On Error Enable (HE): When HE is High, 
the Am29027 will halt operation in the presence of an 
unmasked exception. 



Bit 44— Advance DRDY (AD): When AD is High, signal 
DRDY is advanced one cycle in flow-through mode. This 
bit has no effect in pipeline mode. 

Bits 43-40— Timer Count for the MOVE P Operation 
(MVTC): In flow-through mode, MVTC specifies the 
number of clock cycles needed for data to traverse the 
ALU for base operation code MOVE P; in pipeline mode, 
it has no effect. This field can assume values between 3 
and 15, inclusive. 

Bits 39-36— Timer Count for the Multiply-Accumulate
Operation (MATC): In flow-through mode,
MATC specifies the number of clock cycles needed for
data to traverse the ALU for base operation code
F' = (P' x Q') + T'; in pipeline mode, it has no effect. This
field can assume values between 3 and 15, inclusive.

Bits 35-32— Pipeline Timer Count (PLTC): In flow- 
through mode, PLTC specifies the number of clock cy- 
cles needed for data to traverse the ALU for any base 
operation code except F' = (P' x Q') + T' or MOVE P; in
pipeline mode, it specifies the number of cycles needed 
for data to traverse a single pipeline stage for any base 
operation code. This field can assume values between 3 
and 15, inclusive, in flow-through mode, and between 2 
and 15, inclusive, in pipeline mode. 

Bits 31-28— Reserved for future use. This field must
be set to 0 to assure future compatibility.

Bit 27— Zero Result Exception Mask (ZMSK): When 
ZMSK is High, the status register zero result exception 
bit is masked and will not contribute to the detection of
an error condition. 

Bit 26— Inexact Result Exception Mask (XMSK):
When XMSK is High, the status register inexact result 
exception bit is masked and will not contribute to the de- 
tection of an error condition. 

Bit 25— Underflow Exception Mask (UMSK): When 
UMSK is High, the status register underflow exception 
bit is masked and will not contribute to the detection of 
an error condition. 

Bit 24— Overflow Exception Mask (VMSK): When 
VMSK is High, the status register overflow exception bit 
is masked and will not contribute to the detection of an 
error condition. 

Bit 23— Reserved Operand Exception Mask (RMSK): 

When RMSK is High, the status register reserved oper- 
and exception bit is masked and will not contribute to the 
detection of an error condition. 

Bit 22— Invalid Operation Exception Mask (IMSK): 

When IMSK is High, the status register invalid operation 
exception bit is masked and will not contribute to the 
detection of an error condition. 

Bit 21— Reserved for future use. This bit must be set
to 0 to assure future compatibility.

Bit 20— Pipeline Mode Select (PL): When PL is High, 
pipeline mode is selected; when PL is Low, flow-through 
(unpipelined) mode is selected. 

Bits 19-17— Reserved for future use. This field must
be set to 0 to assure future compatibility.

Bits 16-14— Round Mode Select (RMS): Selects one
of six rounding modes as follows:

RMS   Round Mode
000   Round to nearest (IEEE)
001   Round to minus infinity
010   Round to plus infinity
011   Round to zero
100   Round to nearest (DEC)
101   Round away from zero
11X   Illegal value



Additional information on round modes can be found in 
Appendix B. 

Bits 13-12— Integer Multiplication Format Adjust
(MF): Selects the output format for integer multiplication.
The user may select either the MSBs or the LSBs of
an integer multiplication result, with optional format
adjust. MF is encoded as follows:

MF   Output Format
00   LSBs
01   LSBs, format-adjusted
10   MSBs
11   MSBs, format-adjusted

"Format-adjusted" indicates that the product is shifted
left one place before the MSBs or LSBs are selected.
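
As a rough illustration of the MF encoding, the Python sketch below models the selection for unsigned operands. The function name `select_product`, the `width` parameter, and the decision to discard the bit shifted out of the top of the product are assumptions made for this example; they are not taken from the device documentation.

```python
def select_product(p, q, mf, width=32):
    """Model the MF field for an unsigned integer multiplication.

    mf is the 2-bit MF value: bit 1 selects MSBs (1) or LSBs (0),
    bit 0 enables the one-place left shift ("format adjust").
    width is the operand width in bits (an assumption of this sketch).
    """
    mask = (1 << width) - 1
    product = (p & mask) * (q & mask)            # full 2*width-bit product
    if mf & 0b01:                                # format-adjusted
        # Assumption: the bit shifted out of the top is discarded.
        product = (product << 1) & ((1 << (2 * width)) - 1)
    if mf & 0b10:                                # MSBs
        return (product >> width) & mask
    return product & mask                        # LSBs


if __name__ == "__main__":
    for mf in range(4):
        print(format(mf, "02b"), hex(select_product(0x40000000, 0x40000000, mf)))
```

The format-adjusted MSB selection is the natural choice when both operands are treated as signed fractions, since the left shift removes the redundant sign bit of the product.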

Bit 11— Integer Multiplication Signed/Unsigned 
Select (MS): If MS is High, input operands for integer 
multiplication operations are treated as two's comple- 
ment numbers. If MS is Low, the input operands are 
treated as unsigned numbers. 

Bit 10— Reserved for future use. This bit must be set
to 0 to assure future compatibility.

Bit 9— IBM Underflow Mask Enable (BU): If BU is
High, certain underflowed IBM operations will produce a
normalized result with a biased exponent increased by
128. If BU is Low, these operations will produce a final
result of true zero. BU affects only those operations that
produce a result in IBM format and that use the following
base operation codes:

F' = P' + T'
F' = P' x Q'
Compare P, T
F' = (P' x Q') + T'
Convert T to Alternate F.P. Format
Convert T from Alternate F.P. Format
Scale T to Floating-point by Q



Bit 8— IBM Significance Mask Enable (BS): If BS is
High, certain IBM operations having intermediate results
of 0 will produce a final result of 0 with the
biased exponent unchanged. If BS is Low, these operations
will produce a final result of true zero. BS affects
only those operations that produce a result in IBM
format and that use the F' = P' + Q' and COMPARE P, T
base operation codes.

Bit 7— IEEE Sudden Underflow Enable (SU): If SU is
High, all IEEE denormalized results are replaced by a
0 of the same sign; if SU is Low, the appropriate denormalized
number will be produced. If IEEE traps are enabled
(mode register bit TRP High), sudden underflow is
disabled.

Bit 6— IEEE Trap Enable (TRP): If TRP is High, IEEE
trapped operation is enabled; the Saturate Enable
(SAT) and Sudden Underflow (SU) bits are ignored. For
an underflowed result, the biased exponent is increased
by 192 (single precision) or 1536 (double precision),
with the significand unchanged. For an overflowed result,
the biased exponent is decreased by a like amount
with the significand unchanged. If TRP is Low, IEEE
trapped operation is disabled. This bit affects only those
operations that produce a result in IEEE floating-point
format.

Bit 5— IEEE Affine/Projective Select (AP): If AP is
High, IEEE addition or subtraction operations having 
infinite input operands are performed in affine mode; if 
AP is Low, these operations are performed in projective 
mode. In affine mode, it is permissible to add infinities of 
like sign or subtract infinities of opposite sign, producing 
an infinite result with the appropriate sign. In projective 
mode these operations will produce an invalid operation 
exception. This bit affects only those operations that 
produce a result in IEEE floating-point format. 

Bit 4— Saturate Enable (SAT): If SAT is High, overflowed
results are replaced by the largest representable
value in the selected format of the same sign as the
overflowed result; if SAT is Low, the result produced depends
on the overflow conventions for the selected
floating-point format. If IEEE traps are enabled (mode
register bit TRP High), saturation is disabled for any
operation that produces a result in IEEE floating-point
format.

Bits 1-0 Primary Floating-Point Format (PFF), 
Bits 3-2 Alternate Floating-Point Format (AFF): The 

primary format is used as the source and destination for- 
mat for all floating-point operations except conversions, 
and as the source or destination format for operations 
that convert between floating-point and integer formats. 
The alternate format is used as a source or destination 
format in operations that convert one floating-point 
format to another. Both the PFF and AFF fields are en- 
coded as follows: 



High Bit   Low Bit   Format
0          0         IEEE
0          1         DEC F (Single), DEC D (Double)
1          0         DEC F (Single), DEC G (Double)
1          1         IBM



Floating-point formats are discussed in further detail in 
Appendix A. 
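
The field descriptions above can be collected into a small packing helper. The following Python sketch is illustrative only: the names `MODE_FIELDS` and `pack_mode` are hypothetical, and the RMS field is assumed to occupy bits 16-14 because its encoding is three bits wide. Reserved bits are left at 0.

```python
# (low bit, width) for each mode register field described above.
MODE_FIELDS = {
    "EX": (46, 1), "HE": (45, 1), "AD": (44, 1),
    "MVTC": (40, 4), "MATC": (36, 4), "PLTC": (32, 4),
    "ZMSK": (27, 1), "XMSK": (26, 1), "UMSK": (25, 1),
    "VMSK": (24, 1), "RMSK": (23, 1), "IMSK": (22, 1),
    "PL": (20, 1),
    "RMS": (14, 3),          # assumed to span bits 16-14
    "MF": (12, 2), "MS": (11, 1),
    "BU": (9, 1), "BS": (8, 1), "SU": (7, 1), "TRP": (6, 1),
    "AP": (5, 1), "SAT": (4, 1), "AFF": (2, 2), "PFF": (0, 2),
}


def pack_mode(**values):
    """Build a 64-bit mode word; unspecified fields and reserved bits are 0."""
    word = 0
    for name, value in values.items():
        low, width = MODE_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << low
    return word


if __name__ == "__main__":
    # Flow-through mode, IEEE primary format, round to nearest,
    # timer counts of 3 (the smallest value the text allows).
    mode = pack_mode(MVTC=3, MATC=3, PLTC=3, PL=0, RMS=0b000, PFF=0b00)
    print(f"{mode:#018x}")
```

The helper checks only that a value fits its field; it does not enforce the documented ranges of the timer counts, which a real driver would likely also validate.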

Figure 3. Mode Register

Status Register

The status register contains operation exception status, 
as well as the status of pending operands and operations;
its format is shown in Figure 4. The Am29000 can
initialize or modify the contents of the status register by 
issuing a write status transaction request, and can read 
current status register contents by issuing a read status 
transaction request or as part of a save state sequence. 

All status register bits are initialized to a logic Low after 
hardware reset. 

Figure 4. Status Register



Bits 31-11— Reserved for future use. This field must
be set to 0 when written to assure future compatibility.

Bit 10— Operation Pending (OPP): A logic High indi- 
cates that an operation awaits execution. 

Bit 9— I-Temp Valid (IVA): A logic High indicates that
register I-Temp contains an instruction for a pending
operation.

Bit 8— S-Temp Valid (SVA): A logic High indicates that 
register S-Temp contains an operand for a pending 
operation. 

Bit 7— R-Temp Valid (RVA): A logic High indicates that 
register R-Temp contains an operand for a pending 
operation. 

Bit 6— Exception Status (ES): A logic High indicates 
that status register bits 0-5 contain an unmasked 
exception. 

Bit 5— Zero Result Flag (ZEX): A logic High indicates 
that an operation produced a zero result. Latches until 
cleared. 

Bit 4— Inexact Result Bit (XEX): A logic High indicates
that an operation result had to be rounded to fit the desti- 
nation format. Latches until cleared. 



Bit 3— Underflow Exception Bit (UEX): A logic High 
indicates that an operation result has underflowed the 
destination format. Latches until cleared. 

Bit 2— Overflow Exception Bit (VEX): A logic High in- 
dicates that an operation result overflowed the destina- 
tion format. Latches until cleared. 

Bit 1— Reserved Operand Exception Bit (REX): A 

logic High indicates that a reserved operand appeared 
as an input operand to an operation or was generated as 
a result. Latches until cleared. 

Bit 0— Invalid Operation Exception Bit (IEX): A logic
High indicates that input operands are unsuitable for the
operation performed (e.g., ∞ x 0). Latches until cleared.
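
A status word read from the accelerator can be decoded along the following lines. This Python sketch is an illustration, not driver code: the names `STATUS_BITS`, `decode_status`, and `unmasked_exceptions` are hypothetical. It relies on the observation that the six mask bits in mode register bits 27-22 appear in the same order as the six exception bits in status register bits 5-0.

```python
STATUS_BITS = {
    "OPP": 10, "IVA": 9, "SVA": 8, "RVA": 7, "ES": 6,
    "ZEX": 5, "XEX": 4, "UEX": 3, "VEX": 2, "REX": 1, "IEX": 0,
}


def decode_status(word):
    """Return a dict mapping each status bit name to True/False."""
    return {name: bool((word >> bit) & 1) for name, bit in STATUS_BITS.items()}


def unmasked_exceptions(status, mode):
    """Return the exception bits (status bits 5-0) not masked by the mode word.

    Mode bits 27-22 (ZMSK..IMSK) line up with status bits 5-0 (ZEX..IEX),
    so shifting the mode word right by 22 aligns each mask with its exception.
    """
    masks = (mode >> 22) & 0x3F
    return status & 0x3F & ~masks


if __name__ == "__main__":
    status = (1 << 10) | (1 << 2)        # operation pending, overflow latched
    print(decode_status(status))
    print(bin(unmasked_exceptions(status, mode=1 << 24)))   # VMSK set
```

Under this reading, the Exception Status bit corresponds to `unmasked_exceptions(...) != 0`.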

Flag Register 

The flag register contains 7 flag bits that report excep- 
tion or Boolean results for the most recently performed 
operation; its format is shown in Figure 5. The remaining 
25 register bits are reserved for future use. The 
Am29000 can read the current flag register contents by 
issuing a read flags transaction request. 

Flag register bits 6-0 correspond to Flag 6-Flag 0
(FL6-FL0).

These flags assume a meaning that is operation-de- 
pendent, as discussed in the Operation Flags section. 

The flag register is made transparent in flow-through 
mode. 

Figure 5. Flag Register

Precision Register

The precision register contains 8 bits that report the precision
of operands stored in the register file; its format is
shown in Figure 6. Bit 0 (PR0) reports the precision of
register file location 0 (RF0), bit 1 the precision of location
1 (RF1), and so on. A logic High indicates a single-precision
value, a logic Low a double-precision value.

The precision register also contains the Accelerator Re- 
lease Level (ARL), an 8-bit, read-only identification 
number that specifies the accelerator version. The ARL 
field occupies bits 31-24. 

The remaining 16 bits of the precision word are reserved
for future use, and must be set to 0 when written to assure
future compatibility.
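
The layout just described can be unpacked with a few shifts and masks. In this Python sketch the function name `decode_precision` is hypothetical, and the sample word is made up for illustration.

```python
def decode_precision(word):
    """Split a 32-bit precision word into (ARL, per-location flags).

    ARL is the 8-bit release level in bits 31-24. The list holds one entry
    per register file location, index 0 for location 0; True means the
    location holds a single-precision value, False a double-precision one.
    """
    arl = (word >> 24) & 0xFF
    single = [bool((word >> n) & 1) for n in range(8)]
    return arl, single


if __name__ == "__main__":
    arl, single = decode_precision(0x02000005)   # made-up example value
    print(arl, single)
```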

Figure 6. Precision Register



Instruction Register, I-Temp Register

The instruction register contains a 32-bit instruction 
word that specifies the ALU operation; its format is 
shown in Figure 7. 



Bits    31   30-28   27-24   23-20   19-16   15    14    13-12   11-10   9-8   7-6   5    4-0
Field   RF   RFS     PMS     QMS     TMS     IPR   RPR   SIP     SIQ     SIT   SIF   IF   CO

Figure 7. Instruction Register



Bit 31— Register File Enable (RF): Enables a write to 
the register file. When RF is High, the operation result is 
written to the register file location specified by RFS and 
the resulting precision is written to the corresponding bit 
of the precision register. When RF is Low, no write 
is performed either to the register file or the precision 
register. 

Bits 30-28— Register File Select (RFS): Selects the
register file location (RF7-RF0) to which the operation
result is to be written. If bit RF is Low, the value of RFS is
a "don't care."

Bits 27-24— Select for P Operand Multiplexer
(PMS): Selects the data input for the ALU P port.

Bits 23-20— Select for Q Operand Multiplexer 
(QMS): Selects the data input for the ALU Q port. 



Bits 19-16— Select for T Operand Multiplexer (TMS):
Selects the data input for the ALU T port.

Bit 15— Input Precision (IPR): Precision of the operands
in Registers R and S; single precision when High,
double precision when Low.

Bit 14— Result Precision (RPR): Precision of the ALU 
output; single precision when High, double precision 
when Low. 

Bits 13-12— Sign P (SIP): Sign-change control for the 
ALU P input. 

Bits 11-10— Sign Q (SIQ): Sign-change control for the
ALU Q input.

Bits 9-8— Sign T (SIT): Sign-change control for the 
ALU T input. 

Bits 7-6— Sign F (SIF): Sign-change control for the
ALU output. 

Bit 5— Integer/Floating-point Select (IF): A logic Low 
selects a floating-point operation, a logic High an integer 
operation. 

Bits 4-0— Core Operation (CO): Specifies the core op- 
eration to be performed by the ALU. 

The function of the instruction word fields is discussed in 
further detail in the Accelerator Instruction Set section. 
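
To make the field positions concrete, the Python sketch below packs and unpacks a 32-bit instruction word. The names `INSTR_FIELDS`, `encode_instruction`, and `decode_instruction` are hypothetical, and the field values in the example are arbitrary placeholders: this section does not define what particular PMS, QMS, TMS, sign-control, or CO codes mean.

```python
# (low bit, width) for each instruction word field listed above.
INSTR_FIELDS = {
    "RF": (31, 1), "RFS": (28, 3),
    "PMS": (24, 4), "QMS": (20, 4), "TMS": (16, 4),
    "IPR": (15, 1), "RPR": (14, 1),
    "SIP": (12, 2), "SIQ": (10, 2), "SIT": (8, 2), "SIF": (6, 2),
    "IF": (5, 1), "CO": (0, 5),
}


def encode_instruction(**fields):
    """Pack named fields into a 32-bit word; omitted fields are 0."""
    word = 0
    for name, value in fields.items():
        low, width = INSTR_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << low
    return word


def decode_instruction(word):
    """Unpack a 32-bit word into a dict of field values."""
    return {name: (word >> low) & ((1 << width) - 1)
            for name, (low, width) in INSTR_FIELDS.items()}


if __name__ == "__main__":
    # Placeholder values: write the result to register file location 3,
    # single-precision input and result, floating-point operation.
    word = encode_instruction(RF=1, RFS=3, IPR=1, RPR=1, IF=0, CO=0b00001)
    print(f"{word:#010x}", decode_instruction(word))
```

The thirteen fields account for all 32 bits, so decoding followed by encoding reproduces any word unchanged.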

The I-Temp register has a format identical to that of the 
instruction register; this register is used to temporarily 
buffer instructions for pending operations, thus allowing 
the overlap of operation specification and execution. 

The Am29000 can write to the instruction and I-Temp 
registers by issuing the write instruction transaction 
request, and can read the contents of these registers as 
part of the save state sequence. 

Operand Registers 

The Am29027 holds operands in thirteen 64-bit regis- 
ters. Four registers— R, S, R-Temp, and S-Temp — 
store ALU input operands; a fifth register, F, stores ALU 
results. Eight remaining registers, RF7-RF0, are ar- 
ranged as a file into which operation results can be 
written, and from which operands can be taken for use in 
subsequent operations. 

All operand registers share common data formats; any 
register can hold a single- or double-precision floating- 
point number, or a single- or double-precision integer. 

Floating-point numbers are stored with the sign bit in the 
most significant bit (bit 63) of the operand register. For 
single-precision numbers, the 32 LSBs of the register 
are unused; the value of these unused bits is a "don't 
care." 

Integer numbers are stored with the least significant bit 
placed in the least significant bit (bit 0) of the operand 
register. For single-precision numbers, the 32 MSBs of
the register are unused; the value of these unused bits is 
a "don't care." Floating-point and integer formats are de- 
scribed in further detail in Appendix A. 
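
The alignment rules for the 64-bit operand registers can be illustrated as follows. This Python sketch assumes the IEEE format is selected and uses the standard `struct` module to obtain bit patterns; the function names are hypothetical, and the "don't care" halves are simply left at 0.

```python
import struct


def float32_to_register(x):
    """Single-precision float: occupies bits 63-32, sign in bit 63."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits << 32                      # 32 LSBs are "don't care"


def float64_to_register(x):
    """Double-precision float: fills all 64 bits, sign in bit 63."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def int32_to_register(n):
    """Single-precision integer: occupies bits 31-0, LSB in bit 0."""
    return n & 0xFFFFFFFF                  # 32 MSBs are "don't care"


if __name__ == "__main__":
    print(f"{float32_to_register(-1.0):#018x}")
    print(f"{float64_to_register(1.0):#018x}")
    print(f"{int32_to_register(-1):#018x}")
```

Floating-point values are left-justified and integers right-justified, which is why a 32-bit word cannot be placed in a register without knowing which kind of number it represents.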

Accelerator Transaction Requests 

The Am29000 controls the Am29027 with 13 transaction
requests. Transaction request type is indicated by
the state of four signals: R/W and OPT2-OPT0. Table 1
lists the transaction types and corresponding signal
states.

Transaction requests are conditioned by signal
DREQT1 (which when High indicates an accelerator
transaction) and signals BINV and DREQ. The
Am29027 will recognize a transaction request only if
DREQT1 and BINV are High and DREQ is Low.

Signal DREQT0 modifies the execution of most transaction
requests. For transaction requests that transfer
operands or instructions to the Am29027, asserting
DREQT0 will start the execution of an accelerator
operation. For transaction requests that transfer operation
results, status, or flags to the Am29000, asserting
DREQT0 will suppress the reporting of unmasked
exceptions via signal DERR. For the write status transaction
request, asserting DREQT0 either retimes the operation
currently described by the instruction register
(flow-through mode) or invalidates the ALU pipeline
(pipeline mode).

Write Transaction Requests 

Write transactions transfer data from the Am29000 to 
the Am29027, or cause the Am29027 to transfer data 
internally. To perform a write request, the Am29000: 

■ Issues the appropriate transaction request on
signals OPT2-OPT0, and asserts signal R/W Low

■ Places the data to be transferred, if any, on output 
signals D31-D0 and A31-A0 

The Am29027 responds to the request by asserting one 
(and only one) of two status signals: 

■ CDA indicates that the Am29027 will take the 
specified action and clock in the data accom- 
panying the transaction request, if any, on the next 
rising edge of clock. 

■ DERR indicates that the Am29027 is unable to 
accept the data, due to the presence of an 
unmasked exception. 

Timing for write transactions is illustrated in Appendix D. 



Table 1. Transaction Requests

R/W   OPT2   OPT1   OPT0   Request Type
0     0      0      0      Write Operand R
0     0      0      1      Write Operand S
0     0      1      0      Write Operands R, S
0     0      1      1      Write Mode
0     1      0      0      Write Status
0     1      0      1      Write RF Precisions
0     1      1      0      Write Instruction
0     1      1      1      Advance Temp Registers
1     0      0      0      Read Results MSBs
1     0      0      1      Read Results LSBs
1     0      1      0      Read Flags
1     0      1      1      Read Status
1     1      0      0      Save State



There are eight write transactions: 

Write Operand R: An operand is written to Input Register
R and/or R-Temp. The most significant half of the
64-bit operand to be written is placed on Input Bus R, the
least significant half on Input Bus S. The action taken
depends on signal DREQT0 and on whether an accelerator
operation will be in progress during the next clock
cycle.

DREQT0     Operation in progress   Data         R-Temp      Operation
asserted   next clock cycle        written to   valid bit   pending bit
No         X                       R-Temp       Set         Unchanged
Yes        No                      R-Temp, R    Reset       Reset
Yes        Yes                     R-Temp       Set         Set

If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new operation
will be started on the next rising edge of CLK.

If mode register bit HE (Halt On Error Enable) is High 
and an unmasked exception has been detected, the 
Am29027 will respond to a write operand R request by
asserting signal DERR; the contents of Registers R and 
R-Temp will not be changed, and the R-Temp Valid and 
Operation Pending bits will retain their current values. 
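
The behavior described for the write operand R request can be summarized as a small decision function. The Python sketch below is one reading of the text, not a model of the actual control logic: the function name `write_operand_r`, its arguments, and the dictionary it returns are all hypothetical. `None` stands for "unchanged", and `halted` stands for the case where HE is High and an unmasked exception has been detected.

```python
def write_operand_r(dreqt0, busy_next_cycle, halted=False):
    """Outcome of a write operand R request.

    dreqt0          -- True if DREQT0 is asserted with the request
    busy_next_cycle -- True if an operation will be in progress next cycle
    halted          -- True if HE is High and an unmasked exception exists
    """
    if halted:
        # Request refused: nothing is written and no bit changes.
        return {"status": "DERR", "written": (),
                "rtemp_valid": None, "op_pending": None}
    if not dreqt0:
        return {"status": "CDA", "written": ("R-Temp",),
                "rtemp_valid": True, "op_pending": None}
    if not busy_next_cycle:
        # The operand passes straight through and a new operation starts.
        return {"status": "CDA", "written": ("R-Temp", "R"),
                "rtemp_valid": False, "op_pending": False}
    return {"status": "CDA", "written": ("R-Temp",),
            "rtemp_valid": True, "op_pending": True}


if __name__ == "__main__":
    for args in [(False, False), (True, False), (True, True)]:
        print(args, write_operand_r(*args))
```

The same structure applies to the write operand S request, with S and S-Temp in place of R and R-Temp.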

Write Operand S: An operand is written to Input Register
S and/or S-Temp. The most significant half of the
64-bit operand to be written is placed on Input Bus R,
the least significant half on Input Bus S. The action taken
depends on signal DREQT0 and on whether an accelerator
operation will be in progress during the next clock
cycle.

DREQT0     Operation in progress   Data         S-Temp      Operation
asserted   next clock cycle        written to   valid bit   pending bit
No         X                       S-Temp       Set         Unchanged
Yes        No                      S-Temp, S    Reset       Reset
Yes        Yes                     S-Temp       Set         Set

If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new operation
will be started on the next rising edge of CLK.

If mode register bit HE (Halt On Error Enable) is High 
and an unmasked exception has been detected, the 
Am29027 will re spond to a write operand S request by 
asserting signal DERR; the contents of Registers S and 
S-Temp will not be changed, and the S-Temp Valid and 
Operation Pending bits will retain their cun-ent values. 

Write Operands R, S: Two 32-bit operands are written to
Registers R and S and/or Registers R-Temp and S-Temp. The
32-bit operand to be written to Registers R or R-Temp is
placed on Input Bus R; the 32-bit operand to be written to
Registers S or S-Temp is placed on Input Bus S. Each 32-bit
word is written to both the upper and lower halves of the
target register. The action taken depends on signal DREQT0
and on whether an accelerator operation will be in progress
during the next clock cycle.





DREQT0     Operation in progress   Data                    R-, S-Temp   Operation
asserted   next clock cycle        written to              valid bits   pending bit

No         X                       R-Temp, S-Temp          Set          Unchanged
Yes        No                      R-Temp, S-Temp, R, S    Reset        Reset
Yes        Yes                     R-Temp, S-Temp          Set          Set



If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new
operation will be started on the next rising edge of CLK.

If mode register bit HE (Halt On Error Enable) is High
and an unmasked exception has been detected, the
Am29027 will respond to a write operands R, S request
by asserting signal DERR; the contents of Registers R,
R-Temp, S, and S-Temp will not be changed, and the
R-Temp Valid, S-Temp Valid, and Operation Pending bits
will retain their current values.

Write Mode: A 64-bit word is written to the mode register.
The least significant half of the mode word is placed
on Input Bus R, the most significant half on Input Bus S.
The state of signal DREQT0 is a "don't care" for this
transaction request.



Write Status: A 32-bit word is written to the status
register; the status word to be written is placed on Input
Bus R. Asserting signal DREQT0 will produce an additional
action that is mode-dependent. In flow-through mode,
asserting DREQT0 will cause the operation currently
specified by the instruction register to be retired;
operation results will not be written to the status register
or the register file. In pipeline mode, asserting DREQT0
will invalidate the ALU pipeline.

Write Register File Precisions: A 32-bit word indicating
the precisions of register file locations RF7-RF0 is
written to the precision register; the precision word to be
written is placed on Input Bus R. The state of signal
DREQT0 is a "don't care" for this transaction request.

Write Instruction: A 32-bit accelerator instruction is
written to the instruction register and/or Register
I-Temp. The 32-bit instruction is taken from input signals
I31-I0. The action taken depends on signal DREQT0 and
on whether an accelerator operation will be in progress
during the next clock cycle.



DREQT0     Operation in progress   Data                    I-Temp      Operation
asserted   next clock cycle        written to              valid bit   pending bit

No         X                       I-Temp                  Set         Unchanged
Yes        No                      I-Temp,                 Reset       Reset
                                   instruction register
Yes        Yes                     I-Temp                  Set         Set



If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new
operation will be started on the next rising edge of CLK.

If mode register bit HE (Halt On Error Enable) is High
and an unmasked exception has been detected, the
Am29027 will respond to a write instruction transaction
request by asserting signal DERR; the contents of
Register I-Temp and the instruction register will not be
changed, and the I-Temp Valid and Operation Pending
bits will retain their current values.

Advance Temp Registers: The contents of the R-Temp,
S-Temp, and I-Temp registers are transferred to
Register R, Register S, and the instruction register,
respectively. The state of signal DREQT0 is a "don't care"
for this transaction request. The advance temp registers
transaction request is used during restoration of
accelerator state.

Read Transaction Requests 

Read transactions transfer data from the Am29027 to
the Am29000. When data is to be transferred, the
Am29000:

■ Issues the appropriate transaction request on
signals OPT2-OPT0, and asserts signal R/W High.

■ Places its data bus drivers in a high-impedance
state.

The Am29027 then places the requested data on signals
F31-F0 and issues two status signals:

■ DRDY indicates that the data requested is available
on Output Bus F31-F0.

■ DERR indicates that the Am29027 has detected an
unmasked exception; the exception may or may not
be related to the data requested.

DRDY and DERR may both be active at the same time;
if so, the Am29000 will respond to DERR and ignore
DRDY.

Timing for read transactions is illustrated in Appendix D. 

There are five read transactions: 

Read Result MSBs: The 32 MSBs of Register F are
placed on Output Bus F. Asserting signal DREQT0 will
suppress the reporting of unmasked exceptions.

Read Result LSBs: The 32 LSBs and 32 MSBs of
Register F are placed on Output Bus F in consecutive
clock cycles. Asserting signal DREQT0 will suppress the
reporting of unmasked exceptions. The read result
LSBs request must always be followed by a read result
MSBs request.
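As an illustration of the LSBs-then-MSBs read order, the two 32-bit bus words can be joined into one 64-bit result on the host side. This is only a sketch: the function names are hypothetical, and the result is assumed to be an IEEE double-precision value.

```python
import struct


def combine_result(lsb_word, msb_word):
    """Join two 32-bit bus words (LSBs read first, then MSBs) into 64 bits."""
    return ((msb_word & 0xFFFFFFFF) << 32) | (lsb_word & 0xFFFFFFFF)


def as_ieee_double(bits64):
    """Interpret a 64-bit pattern as an IEEE double-precision number."""
    return struct.unpack(">d", struct.pack(">Q", bits64))[0]


# Example words chosen to form the bit pattern 400921FB54442D18.
bits = combine_result(0x54442D18, 0x400921FB)
value = as_ieee_double(bits)
```

A single-precision result needs only the read result MSBs request, so no joining step is involved in that case.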

Read Flags: The flag register contents are placed on
Output Bus F; bits F31-F7 will be logic Low. Asserting
signal DREQT0 will suppress the reporting of unmasked
exceptions.

Read Status: The status register contents are placed
on Output Bus F; bits F31-F11 will be logic Low. Asserting
signal DREQT0 will suppress the reporting of unmasked
exceptions.

Save State: The contents of the instruction register,
mode register, status register, register file, precision
register, and Registers R, R-Temp, S, S-Temp, and
I-Temp are transferred to the Am29000 via Output Bus F.
Exception reporting via signal DERR is suppressed; the
state of signal DREQT0 is a "don't care." Further details
on the use of this request appear in the Saving and
Restoring State sections.

Coprocessor Data Accept 

The Coprocessor Data Accept (CDA) signal indicates to
the Am29000 that the Am29027 is able to accept new
operands or instructions. CDA is normally Low (active),
but will go High if:

■ the Am29027 has an operation currently in
progress and a completely specified pending
operation waiting in the temporary registers,

or

■ the Am29027 has halted in response to an
unmasked exception (Halt On Error mode enabled).

If the Am29000 issues any write transaction request and
CDA is active Low, the transaction request will complete
in a single cycle. If CDA is High, response to a write
transaction request depends on request type:

■ For the write operand R, write operand S, write
operands R, S, and write instruction transaction
requests, the Am29027 will assert CDA active when
it is able to accept new data. If it is not able to accept
new data indefinitely due to presence of an
unmasked exception (Halt On Error mode enabled),
it will respond to the transaction request by
asserting signal DERR.

■ For the write mode, write status, write register file
precisions, and advance temp registers transaction
requests, the Am29027 will temporarily
assert CDA during the cycle after the request is
issued, regardless of whether an operation is in
progress or an unmasked exception has halted the
accelerator.

CDA pertains only to write transaction requests; for read
transaction requests, the Am29000 ignores the state of
CDA.

Data Ready

The Data Ready (DRDY) signal indicates to the
Am29000 that the Am29027 is placing data on the F output
bus. The Am29027 generates DRDY in response to
the read result MSBs, read result LSBs, read status,
read flags, and save state transaction requests.

For the read result MSBs, read result LSBs, read flags,
and read status transaction requests, there is usually a
minimum of one cycle delay between the time the
request is issued and the time that DRDY is asserted.
The only exception to this rule is when a read result
LSBs request is immediately followed by a read result
MSBs request, in which case the Am29027 responds to
the second request in a single cycle. If the Am29027 is
unable to respond immediately to a read transaction
request, as may be the case when an operation is in
progress, the DRDY signal will be held inactive until
such a time as the requested data can be output. For the
save state transaction request, the delay between the
issuance of the transaction request and the DRDY
response varies according to the specific data requested.

DRDY pertains only to read transaction requests; for
write transaction requests, DRDY remains inactive.

Data Error



The Data Error (DERR) signal indicates to the Am29000
that the Am29027 is unable to respond to a transaction
request normally, due to the presence of an unmasked
exception bit in the status register.

For read transaction requests read result LSBs, read
result MSBs, read flags, and read status, the Am29027
asserts DERR if the status register contains an unmasked
exception bit. The Am29000 may suppress error reporting
for these requests by issuing them with signal DREQT0
asserted.

For write transaction requests write operand R, write
operand S, write operands R, S, and write instruction,
DERR is issued in the presence of an unmasked exception
if Halt On Error mode is enabled; in such an event,
the contents of the target registers are left unchanged.

DERR is never issued in response to transaction requests
write mode, write status, write register file precisions,
advance temp registers, and save state.
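The three DERR rules above can be condensed into one predicate. The sketch below is illustrative only; the function name and the request strings are hypothetical labels chosen for readability, not identifiers defined by the Am29027.

```python
# Requests that report unmasked exceptions unless DREQT0 is asserted.
READ_REQUESTS = {"read result LSBs", "read result MSBs",
                 "read flags", "read status"}

# Requests that report unmasked exceptions only in Halt On Error mode.
WRITE_REQUESTS = {"write operand R", "write operand S",
                  "write operands R, S", "write instruction"}


def derr_asserted(request, unmasked_exception,
                  dreqt0=False, halt_on_error=False):
    """Return True if DERR would be issued for this transaction request."""
    if not unmasked_exception:
        return False
    if request in READ_REQUESTS:
        return not dreqt0          # DREQT0 suppresses error reporting
    if request in WRITE_REQUESTS:
        return halt_on_error       # target registers are left unchanged
    return False                   # write mode, save state, etc.: never
```

For example, `derr_asserted("read flags", True, dreqt0=True)` is False, which is how a trap handler could inspect the flags without being interrupted again.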

Accelerator Instruction Set 

The ALU performs 57 arithmetic and logic instructions.
Input operands for these instructions can be taken from
Input Registers R and S, register file locations RF7-RF0,
and on-board constant stores. At the user's option,
results can be stored in register file locations RF7-RF0.

Instruction Word 

The 32-bit instruction word, IN31-IN0, specifies the
operation to be performed by the ALU. The instruction
word is stored in the instruction register; instruction
register format is shown in Figure 7. In flow-through mode,
the instruction word specifies the operation to be
performed by the entire ALU. In pipeline mode, the
instruction word specifies the operation to be performed
by the first pipeline stage; the remaining pipeline stage or
stages are controlled by their respective pipeline
registers. The instruction word also specifies input operand
sources, result destination, and operand precisions.

An instruction word comprises five sections: base
operation code, sign-change selects, operand precision
selects, operand source selects, and register file
controls.

Base Operation Code 

The base operation code consists of the core operation
field (CO), which specifies the type of operation to be
performed, and the integer/floating-point select bit (IF),
which specifies whether the operation is integer or
floating-point. Available base operation codes and the
corresponding values for CO and IF are listed in Table 2. Note
that the value of IF is a "don't care" for base operation
code MOVE P.

Sign-Change Selects 

Each ALU input and output port has associated hardware
that can be used to modify operand signs (see Figure 8).
These sign-change blocks, when applied to base
operations, greatly increase the number of available
operations. The base operation code F' = P' + T', for
example, can be used to perform operations such as
P - T, ABS(P) + ABS(T), ABS(P + T), and others, simply
by modifying the signs of the input and output operands.
The SIP, SIQ, and SIT instruction word fields control the
sign-change blocks for the P, Q, and T input operands,
respectively; the SIQ and SIF fields control the
sign-change block for output operand F.

Using the sign-change blocks, the sign of an input operand
may be left unchanged, inverted, set Low, or set
High; the sign of the output operand may be left
unchanged, inverted, set Low, set High, set to the sign of
the P input operand, or set to the sign of the T input
operand. Select codes for the P, Q, T, and F sign-change
blocks are shown in Tables 3, 4, 5, and 6, respectively.
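A short software model shows how one base operation, F' = P' + T', yields several instructions once the sign-change blocks are applied. This is a sketch under stated assumptions: the select codes 0 to 3 stand for unchanged, inverted, sign set Low, and sign set High, a mapping inferred from the Table 9 entries for P - T, ABS(P + T), and ABS(P) + ABS(T); the function names are hypothetical, and the options that copy the sign of P or T onto the output are not modeled.

```python
import math


def sign_change(select, x):
    """Apply a two-bit sign-change select code to a floating-point value."""
    if select == 0:
        return x                       # sign left unchanged
    if select == 1:
        return -x                      # sign inverted
    if select == 2:
        return math.copysign(x, 1.0)   # sign set Low (non-negative)
    if select == 3:
        return math.copysign(x, -1.0)  # sign set High (negative)
    raise ValueError("select code must be 0..3")


def add_base(p, t, sip=0, sit=0, sif=0):
    """Model of base operation F' = P' + T' with sign-change blocks."""
    f_prime = sign_change(sip, p) + sign_change(sit, t)
    return sign_change(sif, f_prime)
```

With these definitions, `add_base(p, t, sit=1)` computes P - T, `add_base(p, t, sif=2)` computes ABS(P + T), and `add_base(p, t, sip=2, sit=2)` computes ABS(P) + ABS(T).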

Operand Precision Selects 

The Am29027 supports mixed-precision operations; it is 
possible, for example, to perform an operation having 
single-precision inputs and a double-precision output, 
or one single- and one double-precision input, or any 
other combination. 

The precision of the operands in Registers R and S
is specified by instruction bit IPR, which is logic High for
single-precision operands and logic Low for
double-precision operands. Note that the operands in the R and S
registers must have the same precision if they are to be
used in the same operation. This restriction does not
preclude performing an operation with mixed-precision
input operands, as there are no restrictions on the
precisions of operands stored in the register file. The
precision of each operand stored in the register file is
recorded in the precision register; this precision
information is automatically supplied to the ALU when a
register file location is specified as an input operand to
an operation.

The precision of an operation result is specified by
instruction bit RPR, which is set High for a single-precision
result, and Low for a double-precision result. Should the
instruction word specify that the result is to be written to
the register file (instruction bit RF High), the resulting
precision will be written to the appropriate precision
register bit when the result is written to the register file.
Table 2. Operation Codes

(IF = IN5; CO = IN4-IN0)

CO       Base Operation Code (Floating-Point)

00000    F' = P
00001    F' = P' + T'
00010    F' = P' x Q'
00011    Compare P, T
00100    Max P, T
00101    Min P, T
00110    Convert T to Integer
00111    Scale T to Integer by Q
01000    F' = (P' x Q') + T'
01001    Round T to Integral Value
01010    Reciprocal Seed of P
01011    Convert T to Alternate F.P. Format
01100    Convert T from Alternate F.P. Format

CO       Base Operation Code (Integer)

00000    F = P
00001    F = P + T
00010    F = P x Q
00011    Compare P, T
00100    Max P, T
00101    Min P, T
00110    Convert T to Floating-Point
00111    Scale T to Floating-Point by Q
10000    F = P OR T
10001    F = P AND T
10010    F = P XOR T
10011    Shift P Logical Q Places
10100    Shift P Arithmetic Q Places
10101    Funnel Shift PT Logical Q Places

IF  CO       Base Operation Code (Special)

X   11000    MOVE P
[Figure: 64-bit operands P, Q, and T each pass through a
Sign Change block to give P', Q', and T', which enter the
ALU Core; the core output F' passes through a Sign Change
block to give the 64-bit result F.]

Figure 8. ALU Sign-Change Blocks
Table 3. Select Codes for P Operand
Sign-Change Block

SIP
IN13   IN12    SIGN(P')

0      0       SIGN(P)
0      1       SIGN(P) inverted
1      0       0
1      1       1

Table 4. Select Codes for Q Operand
Sign-Change Block

SIQ
IN11   IN10    SIGN(Q')

0      0       SIGN(Q)
0      1       SIGN(Q) inverted
1      0       0
1      1       1

Table 5. Select Codes for T Operand
Sign-Change Block

SIT
IN9    IN8     SIGN(T')

0      0       SIGN(T)
0      1       SIGN(T) inverted
1      0       0
1      1       1

Table 6. Select Codes for F Operand
Sign-Change Block

                   SIQ            SIF
Base Operation     IN11   IN10    IN7   IN6    SIGN(F)

F' = P             0      X       0     0      SIGN(F')
(Floating-Point)   0      X       0     1      SIGN(F') inverted
F = P (Integer)    0      X       1     0      0
Maximum P, T       0      X       1     1      1
OR                 1      0       X     X      SIGN(P)
Minimum P, T       1      1       X     X      SIGN(T)

All Other Base     X      X       0     0      SIGN(F')
Operations         X      X       0     1      SIGN(F') inverted
                   X      X       1     0      0
                   X      X       1     1      1
Operand Source Selects

Instruction fields PMS, QMS, and TMS specify the
select codes for the P, Q, and T operand multiplexers,
respectively; these codes are summarized in Table 7.

The P, Q, and T operand multiplexers can independently
select Register R, Register S, register file locations
RF7-RF0, or one of six predefined constants. For
operations with floating-point inputs, constants 0, 0.5, 1,
2, 3, and pi are available; for operations with integer
inputs, constants 0, -1, 1, 2, 3, and -(2^63) are available.
These constants are supplied to the ALU as
double-precision numbers, independent of the precisions specified
for other input and result operands. Hexadecimal values
for the constants are listed in Table 8.

Register File Controls 

Instruction fields RF and RFS control the storing of
operation results in the register file. If register file enable bit
RF is High, the result of the operation specified by the
instruction word will be stored in register file location
RFS, where RFS is a number from 7 to 0; the precision
of the result, as specified by the RPR bit, will be written
to the appropriate bit in the precision register. If RF is
Low, the operation result is written to neither the register
file nor the precision register.
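The five sections of the instruction word can be packed into a 32-bit value with a small helper. The field positions used below are an assumption, not a statement of the instruction register format: they follow the bit labels that accompany the select-code tables (PMS at IN27-IN24 down to CO at IN4-IN0), while the positions of RF, RFS, IPR, and RPR are simply the remaining bits and should be checked against Figure 7. The function names are hypothetical.

```python
# (field name, least significant bit, width in bits): ASSUMED layout.
FIELDS = [
    ("RF", 31, 1), ("RFS", 28, 3),
    ("PMS", 24, 4), ("QMS", 20, 4), ("TMS", 16, 4),
    ("IPR", 15, 1), ("RPR", 14, 1),
    ("SIP", 12, 2), ("SIQ", 10, 2), ("SIT", 8, 2), ("SIF", 6, 2),
    ("IF", 5, 1), ("CO", 0, 5),
]
FIELD_NAMES = {name for name, _, _ in FIELDS}


def pack_instruction(**values):
    """Build a 32-bit instruction word; unspecified fields default to 0."""
    unknown = set(values) - FIELD_NAMES
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    word = 0
    for name, lsb, width in FIELDS:
        v = values.get(name, 0)
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name}={v} does not fit in {width} bits")
        word |= v << lsb
    return word


def unpack_instruction(word):
    """Split a 32-bit instruction word back into its fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, lsb, width in FIELDS}
```

For instance, `pack_instruction(RF=1, RFS=3, TMS=1, SIT=1, CO=1)` would describe P - T on Registers R and S with the result stored in RF3, provided the assumed layout holds.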

Accelerator Operations 

Table 9 illustrates a number of possible ALU instructions
and corresponding values for instruction word fields
SIP, SIQ, SIT, SIF, IF, and CO. Note that the remaining
instruction fields (RF, RFS, PMS, QMS, TMS, IPR, and
RPR) can be specified independently.

The user may create additional instructions using
instruction words other than those listed in Table 9. For
some base operation codes, sign-change control settings
SIP, SIQ, SIT, and SIF are completely arbitrary;
for others, only the sign-change field values shown in
Table 9 are valid. Table 10 summarizes permissible
sign-change field values for each base operation code.

Table 7. Select Codes for P, Q, and T
Operand Multiplexers

PMS   IN27   IN26   IN25   IN24     P
QMS   IN23   IN22   IN21   IN20     Q
TMS   IN19   IN18   IN17   IN16     T

      0      0      0      0        Register R
      0      0      0      1        Register S
      0      0      1      0        0 (Zero)
      0      0      1      1        0.5 (F.P.), -1 (integer)
      0      1      0      0        1
      0      1      0      1        2
      0      1      1      0        3
      0      1      1      1        pi (F.P.), -2^63 (integer)
      1      0      0      0        RF0
      1      0      0      1        RF1
      1      0      1      0        RF2
      1      0      1      1        RF3
      1      1      0      0        RF4
      1      1      0      1        RF5
      1      1      1      0        RF6
      1      1      1      1        RF7




Table 8. Hexadecimal Values for On-Chip Constants

IEEE Floating-Point Constant     Hexadecimal Representation

0                                0000000000000000
0.5                              3FE0000000000000
1                                3FF0000000000000
2                                4000000000000000
3                                4008000000000000
pi                               400921FB54442D18

DEC D Floating-Point Constant    Hexadecimal Representation

0                                0000000000000000
0.5                              4000000000000000
1                                4080000000000000
2                                4100000000000000
3                                4140000000000000
pi                               41490FDAA22168C2

DEC G Floating-Point Constant    Hexadecimal Representation

0                                0000000000000000
0.5                              4000000000000000
1                                4010000000000000
2                                4020000000000000
3                                4028000000000000
pi                               402921FB54442D18

IBM Floating-Point Constant      Hexadecimal Representation

0                                0000000000000000
0.5                              4080000000000000
1                                4110000000000000
2                                4120000000000000
3                                4130000000000000
pi                               413243F6A8885A31

Integer Constant                 Hexadecimal Representation

0                                0000000000000000
-1                               FFFFFFFFFFFFFFFF
1                                0000000000000001
2                                0000000000000002
3                                0000000000000003
-2^63                            8000000000000000
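The IEEE and integer entries of Table 8 can be checked on any host that uses IEEE double precision and two's complement integers. The sketch below does only that; the helper names are hypothetical, and the DEC and IBM formats are not covered because the Python standard library has no encoder for them.

```python
import math
import struct

IEEE_CONSTANTS = {
    0.0: 0x0000000000000000,
    0.5: 0x3FE0000000000000,
    1.0: 0x3FF0000000000000,
    2.0: 0x4000000000000000,
    3.0: 0x4008000000000000,
    math.pi: 0x400921FB54442D18,
}

INTEGER_CONSTANTS = {
    0: 0x0000000000000000,
    -1: 0xFFFFFFFFFFFFFFFF,
    1: 0x0000000000000001,
    2: 0x0000000000000002,
    3: 0x0000000000000003,
    -(2 ** 63): 0x8000000000000000,
}


def double_bits(x):
    """Return the 64-bit IEEE double-precision pattern of x."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def int64_bits(n):
    """Return the 64-bit two's complement pattern of n."""
    return n & 0xFFFFFFFFFFFFFFFF


ieee_ok = all(double_bits(v) == bits for v, bits in IEEE_CONSTANTS.items())
int_ok = all(int64_bits(v) == bits for v, bits in INTEGER_CONSTANTS.items())
```

Both `ieee_ok` and `int_ok` evaluate to True when the table entries agree with the host encodings.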
Table 9. Instruction Words for Typical ALU Operations

Operation                                      SIP  SIQ  SIT  SIF  IF  CO

FP P                                           00   00   XX   00       00000
FP -P                                          00   00   XX   01       00000
FP ABS(P)                                      00   00   XX   10       00000
FP Sign(T) x ABS(P)                            00   11   XX   XX       00000
FP P + T                                       00   XX   00   00       00001
FP P - T                                       00   XX   01   00       00001
FP T - P                                       01   XX   00   00       00001
FP -P - T                                      01   XX   01   00       00001
FP ABS(P + T)                                  00   XX   00   10       00001
FP ABS(P - T)                                  00   XX   01   10       00001
FP ABS(P) + ABS(T)                             10   XX   10   00       00001
FP ABS(P) - ABS(T)                             10   XX   11   00       00001
FP ABS[ABS(P) - ABS(T)]                        10   XX   11   10       00001
FP P x Q                                       00   00   XX   00       00010
FP (-P) x Q                                    01   00   XX   00       00010
FP ABS(P x Q)                                  00   00   XX   10       00010
FP Compare P, T                                00   XX   01   00       00011
FP Max P, T                                    00   00   01   00       00100
FP Max ABS(P), ABS(T)                          10   00   11   00       00100
FP Min P, T                                    01   00   00   00       00101
FP Min ABS(P), ABS(T)                          11   00   10   00       00101
FP Limit P to Magnitude T                      11   10   10   XX       00101
FP Convert T to Integer                        XX   XX   00   00       00110
FP Scale T to Integer by Q                     XX   00   00   00       00111
FP T + P x Q                                   00   00   00   00       01000
FP T - P x Q                                   01   00   00   00       01000
FP -T + P x Q                                  00   00   01   00       01000
FP -T - P x Q                                  01   00   01   00       01000
FP ABS(T) + ABS(P x Q)                         10   10   10   00       01000
FP ABS(T) - ABS(P x Q)                         11   10   10   00       01000
FP ABS(P x Q) - ABS(T)                         10   10   11   00       01000
FP Round T to Integral Value                   XX   XX   00   00       01001
FP Reciprocal Seed (P)                         00   XX   XX   00       01010
FP Convert T to Alternate
   Floating-Point Format                       XX   XX   00   00       01011
FP Convert T from Alternate
   Floating-Point Format                       XX   XX   00   00       01100
int P                                          00   00   00   00       00000
int -P                                         00   00   00   01       00000
int ABS(P)                                     00   00   00   10       00000
int Sign(T) x ABS(P)                           00   11   00   XX       00000
int P + T                                      00   XX   00   00       00001
int P - T                                      00   XX   01   00       00001
int T - P                                      01   XX   00   00       00001
int ABS(P + T)                                 00   XX   00   10       00001
int ABS(P - T)                                 00   XX   01   10       00001
int P x Q                                      00   00   XX   00       00010
int Compare P, T                               00   XX   01   00       00011
int Max P, T                                   00   00   01   00       00100
int Min P, T                                   01   00   00   00       00101
int Convert T to Float                         XX   XX   00   00       00110
int Scale T to Float by Q                      XX   00   00   00       00111
int P OR T                                     XX   XX   XX   XX       10000
int P AND T                                    XX   XX   XX   XX       10001
int P XOR T                                    XX   XX   XX   XX       10010
int NOT T (see Note 1)                         XX   XX   XX   XX       10010
int Shift P Logical Q Places                   00   00   XX   00       10011
int Shift P Arithmetic Q Places                00   00   XX   00       10100
int Funnel Shift PT Logical Q Places           00   00   00   00       10101
MOVE P                                         XX   XX   XX   XX   X   11000

Note 1. NOT T is performed by XORing T with a word containing
all 1s (integer -1). When invoking NOT T the user must set
instruction field PMS to 0011, thus selecting integer
constant -1.










Table 10. Allowable Sign-Change Combinations

IF  CO      Operation                        SIP  SIQ  SIT  SIF

    00000   FP F' = P                        F    V    X    V
    00001   FP F' = P' + T'                  V    X    V    V
    00010   FP F' = P' x Q'                  V    V    X    V
    00011   FP Compare P, T                  F    X    F    F
    00100   FP Max P, T                      F    F    F    F
    00101   FP Min P, T                      F    F    F    F
    00110   FP Convert T to Integer          X    X    F    F
    00111   FP Scale T to Integer            X    F    F    F
    01000   FP F' = (P' x Q') + T'           V    V    V    V
    01001   FP Round T                       X    X    F    F
    01010   FP Reciprocal Seed P             F    X    X    F
    01011   FP Convert T to Alt Format       X    X    F    F
    01100   FP Convert T from Alt Format     X    X    F    F
    00000   int F = P                        F    F    F    F
    00001   int F = P + T                    F    X    F    F
    00010   int F = P x Q                    F    F    X    F
    00011   int Compare P, T                 F    X    F    F
    00100   int Max P, T                     F    F    F    F
    00101   int Min P, T                     F    F    F    F
    00110   int Convert T to F.P.            X    X    F    F
    00111   int Scale T to F.P.              X    F    F    F
    10000   int F = P OR T                   X    X    X    X
    10001   int F = P AND T                  X    X    X    X
    10010   int F = P XOR T                  X    X    X    X
    10011   int Shift P Logical              F    F    X    F
    10100   int Shift P Arithmetic           F    F    X    F

Key: V = Variable; user can specify arbitrary sign change.
     F = Fixed; user is restricted to sign-change combinations shown in Table 9.
     X = Don't care; this field does not affect the operation or its result.
Base Operation Code Description

F' = P (Floating-Point): The P-operand is passed
through the ALU unchanged, except for any specified
precision conversions. If the user specifies different
input and output precisions, the operation may be used to
perform single-to-double or double-to-single
conversions. Instructions such as negation, absolute value
extraction, and sign transfer may be executed by setting
the sign-change controls appropriately while executing
this base operation.

F' = P' + T' (Floating-Point): The two operands P' and
T' are added, taking into account any specified precision
conversions. Instructions such as subtraction,
sum-of-absolute-values, difference-of-absolute-values,
absolute-value-of-sum, and absolute-value-of-difference
may be executed by setting the sign-change controls
appropriately while executing this base operation.

F' = P' x Q' (Floating-Point): The operands P' and Q'
are multiplied, taking into account any specified
precision conversions. Instructions such as negative-product
and absolute-value-of-product may be executed by
setting the sign-change controls appropriately while
executing this base operation.

Compare P, T (Floating-Point): The two operands P
and T are compared, taking into account any specified
precision conversions. The output of the operation is the
result of the subtraction (P - T). The flags are set
appropriately to indicate the result of the comparison,
conforming to the relevant parts of the floating-point
standards. For IEEE and DEC operations, one of four
flags (greater than, less than, equal to, or unordered) is
set for any given compare operation. For IBM
operations, the unordered flag does not apply since the format
does not support reserved operands.
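The four-way outcome of an IEEE compare can be illustrated on a host processor. This is a minimal sketch: the function name is hypothetical, the returned strings are merely labels for the four flags, and only the IEEE case is modeled, with a NaN standing in for a reserved operand.

```python
import math


def compare_flags(p, t):
    """Return which of the four compare flags an IEEE comparison would set."""
    if math.isnan(p) or math.isnan(t):
        return "unordered"          # a NaN compares unordered with anything
    if p > t:
        return "greater than"
    if p < t:
        return "less than"
    return "equal to"
```

Exactly one label is returned for any pair of inputs, mirroring the statement that one flag is set for any given compare operation.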

Maximum P, T (Floating-Point): The two operands P
and T are compared, taking into account any specified
precision conversions. The most positive operand is
selected as the output. The Winner flag indicates which of
the operands is selected. Additionally, the operation
maximum-of-absolute-value may be performed by
setting the appropriate sign-change controls.

Minimum P, T (Floating-Point): The two operands P
and T are compared, taking into account any specified
precision conversions. The most negative operand is
selected as the output. The Winner flag indicates which
of the two operands is selected. Additionally, the
operations minimum-of-absolute-values and
limit-P-to-magnitude-T may be performed by setting the appropriate
sign-change controls. The limit-P-to-magnitude-T
operation is useful for clipping a sequence of operands to
ensure that their magnitude never exceeds a preset
limit.
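The clipping behavior of limit-P-to-magnitude-T can be expressed in a few lines. The sketch below assumes the operation returns the smaller of the two magnitudes with the sign of P restored, which is one reading of the description above; the function name is hypothetical.

```python
import math


def limit_to_magnitude(p, t):
    """Clip p so that its magnitude never exceeds abs(t); keep the sign of p."""
    return math.copysign(min(abs(p), abs(t)), p)


# Clipping a sequence of samples to a preset limit of 1.0.
samples = [-3.0, 0.25, 7.0, -0.5]
clipped = [limit_to_magnitude(x, 1.0) for x in samples]
```

Values already inside the limit pass through unchanged, so the operation can be applied to every element of a data stream without a prior test.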

Convert T to Integer (Floating-Point): The operand T
is converted from floating-point representation to two's
complement integer representation, taking into account
the specified precision of the floating-point operand. If
the output precision is specified as single, the result is a
32-bit integer. If the output precision is specified as
double, the result is a 64-bit integer.

Scale T to Integer by Q (Floating-Point): The operand
T is converted from floating-point representation to
two's complement integer representation, using the
exponent of the floating-point operand Q as a scale
factor and taking into account the specified precision of
the floating-point operands. The unbiased exponent of
the operand Q is added to the exponent of the operand
T, permitting IEEE and DEC operands to be multiplied
by any power of 2, and IBM operands by any power
of 16, before the conversion is performed. If the output
precision is specified as single, the result is a 32-bit
integer. If the output precision is specified as double, the
result is a 64-bit integer.
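For IEEE operands, the scale-then-convert step can be sketched as follows. Several points are assumptions rather than documented behavior: the function names are hypothetical, rounding uses Python's round-half-to-even instead of whatever rounding mode the mode register selects, and an out-of-range result raises an exception where the accelerator would report an exception through its status bits.

```python
import math


def unbiased_exponent(q):
    """Exponent e with abs(q) = m * 2**e and 1 <= m < 2 (q nonzero, finite)."""
    _, exp = math.frexp(q)      # frexp normalizes the mantissa to [0.5, 1)
    return exp - 1


def scale_to_integer(t, q, bits=64):
    """Multiply t by 2**exponent(q), then convert to a two's complement int."""
    scaled = math.ldexp(t, unbiased_exponent(q))   # adds exponents exactly
    result = round(scaled)                         # ASSUMED rounding mode
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= result <= hi:
        raise OverflowError("result does not fit in the output precision")
    return result
```

Only the exponent of Q matters: `scale_to_integer(1.5, 8.0)` and `scale_to_integer(1.5, 15.0)` both scale by 2^3 and give 12.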

F' = (P' x Q') + T' (Floating-Point): The operands P' and
Q' are multiplied, producing a double-precision product.
This product is added to the operand T', taking into
account any specified precision conversions. Instructions
such as P x Q - T, T - P x Q, ABS(P x Q) + ABS(T), and
ABS(P x Q + T) may be executed by setting the
sign-change controls appropriately while executing this base
operation.

Round T to Integral Value (Floating-Point): The float- 
ing-point operand T is rounded to an integer-valued 
floating-point operand, using the specified rounding 
mode and taking into account any specified precision 
conversions. As an example, the operation converts a 
floating-point representation of Pi (3.14159 . . . ) to a 
floating-point representation of 3.0 or 4.0, depending on 
the rounding mode selected. The final result of the op- 
eration is a floating-point number. 
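The dependence of the result on rounding mode can be reproduced with standard library functions. The mode names below are hypothetical labels for the four IEEE rounding directions and do not correspond to any mode register encoding.

```python
import math

# Hypothetical mode names mapped to host rounding functions.
ROUNDERS = {
    "nearest_even": round,          # Python rounds halves to even
    "toward_minus_inf": math.floor,
    "toward_plus_inf": math.ceil,
    "toward_zero": math.trunc,
}


def round_to_integral(t, mode):
    """Round t to an integer value but keep it as a floating-point number."""
    return float(ROUNDERS[mode](t))


pi_down = round_to_integral(math.pi, "nearest_even")
pi_up = round_to_integral(math.pi, "toward_plus_inf")
```

Here `pi_down` is 3.0 and `pi_up` is 4.0, matching the example in the text; in both cases the result stays in floating-point form.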

Reciprocal Seed of P (Floating-Point): An approxima- 
tion to the reciprocal of the operand P is evaluated, 
taking into account any specified precision conversions. 
The reciprocal seed comprises an accurate sign, a fully- 
accurate exponent and a mantissa that is accurate to 
only one place. This operation can be used as the initial 
step in performing Newton-Raphson division; option- 
ally, an external seed look-up table can be used for 
faster convergence. 
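Newton-Raphson division refines a reciprocal estimate x of 1/b with the iteration x <- x(2 - bx), which roughly doubles the number of correct bits per step. The sketch below illustrates the idea only: `reciprocal_seed` is a hypothetical stand-in for the hardware seed (correct sign and exponent, a fixed coarse mantissa), and the iteration count of six is chosen to suit this particular seed, not taken from the data sheet.

```python
import math


def reciprocal_seed(b):
    """Coarse 1/b estimate: exact sign and exponent, rough mantissa."""
    _, exp = math.frexp(b)                 # b = m * 2**exp, 0.5 <= |m| < 1
    return math.copysign(math.ldexp(1.5, -exp), b)


def reciprocal(b, iterations=6):
    """Refine the seed with x <- x * (2 - b * x)."""
    x = reciprocal_seed(b)
    for _ in range(iterations):
        x = x * (2.0 - b * x)              # one multiply-add, one multiply
    return x


def divide(a, b):
    """Division as multiplication by the refined reciprocal."""
    return a * reciprocal(b)
```

Each step maps onto the multiply and multiply-add base operations; a more accurate seed, such as one from an external look-up table, would reduce the number of iterations needed.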

Convert T to Alternate Floating-Point Format
(Floating-Point): The floating-point operand T, assumed to
be in the primary floating-point format, is converted to a
floating-point operand in the alternate floating-point
format, taking into account any specified precision
conversions.

Convert T from Alternate Floating-Point Format 
(Floating-Point): The floating-point operand T, as- 
sumed to be in the alternate floating-point format, is 
converted to a floating-point operand in the primary 
floating-point format, taking into account any specified 
precision conversions. 

F = P (Integer): The P-operand is passed through the 
ALU unchanged except for any specified precision 
conversions. If the user specifies different input and output
precisions, the operation may be used to perform
single-to-double or double-to-single conversions.
Instructions such as negation, absolute value extraction,
and sign transfer may be performed by setting the sign- 
change control appropriately while executing this base 
operation. 

F = P + T (Integer): The two operands P and T are 
added, taking into account any specified precision 
conversions. Instructions such as subtraction,
absolute-value-of-sum, and absolute-value-of-difference may be
performed by setting the sign-change controls appropri- 
ately while executing this base operation. 

F = P X Q (Integer): The two operands P and Q are mul- 
tiplied, taking into account any specified precision con- 
versions. Either 32-bit multiplication or 64-bit multiplica- 
tion may be performed, and the user may select either 
the MSBs or the LSBs of the product as the final result. 
In addition, format-adjusting may be implemented if 
required, and the operands may be considered as 
signed (two's complement) or unsigned. 

Compare P, T (Integer): The two operands P and T are
compared, taking into account any specified precision 
conversions. The output of the operation is the result of 
the subtraction (P-T). The flags are set appropriately to 
indicate the result of the comparison, one of three flags 
(greater than, less than, or equal to) being set for any 
given compare operation. 

Maximum P, T (Integer): The two operands P and
T are compared, taking into account any specified preci- 
sion conversions. The most positive operand is selected 
as the output. The Winner flag indicates which of the two 
operands is selected. 

Minimum P, T (Integer): The two operands P and T are 
compared, taking into account any specified precision 
conversions. The most negative operand is selected as 
the output. The Winner flag indicates which of the two 
operands is selected. 

Convert T to Floating-Point (Integer): The operand T 
is converted from two's complement integer representa- 
tion to floating-point representation, taking into account 
the specified precision of the integer operand. If the 
output precision is specified as single, the result is a 
32-bit floating-point operand. If the output precision is 
specified as double, the result is a 64-bit floating-point 
operand. 

Scale T to Floating-Point by Q (Integer): The operand 
T is converted from two's complement integer represen- 
tation to floating-point representation, using the expo- 
nent of the floating-point operand Q as a scale factor 
and taking into account the specified precision of the in- 
teger operand. The unbiased exponent of the operand 
Q is added to the exponent of the floating-point result, 
permitting IEEE and DEC operands to be multiplied by 
any power of 2, and IBM operands by any power of 16 
after the conversion is performed. If the output precision 
is specified as single, the result is a 32-bit floating-point 
operand. If the output precision is specified as double, 
the result is a 64-bit floating-point operand. 



F = P OR T (Integer): The operand P is logically ORed 
with the operand T. Before the operation is performed, 
the inputs, if 32-bit, are sign-extended to 64 bits. 

F = P AND T (Integer): The operand P is logically 
ANDed with the operand T. Before the operation is per- 
formed, the inputs, if 32-bit, are sign-extended to 64 bits. 

F = P XOR T (Integer): The operand P is logically exclu- 
sive-ORed with the operand T. Before the operation is 
performed, the inputs, if 32-bit, are sign-extended to 64 
bits. This operation may be used to invert an operand by 
selecting the second operand to be the integer constant, 
-1, so that all bits of this second operand are 1. 
Exclusive-ORing an operand with -1 is equivalent to 
inverting each bit in the operand. 
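
The inversion identity can be checked with a short sketch. Python integers are unbounded, so the 64-bit width is imposed with an explicit mask; the helper names are illustrative assumptions.

```python
MASK64 = (1 << 64) - 1

def sign_extend_32_to_64(x: int) -> int:
    """Sign-extend a 32-bit value to 64 bits, as described for 32-bit inputs."""
    x &= 0xFFFFFFFF
    return (x | 0xFFFFFFFF00000000) if (x & 0x80000000) else x

def xor64(p: int, t: int) -> int:
    """64-bit exclusive-OR of two operands."""
    return (p ^ t) & MASK64

# A 32-bit -1 sign-extends to 64 one-bits, so XOR with it inverts every bit.
operand = sign_extend_32_to_64(0x0000FFFF)
minus_one = sign_extend_32_to_64(0xFFFFFFFF)
inverted = xor64(operand, minus_one)
```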

Shift P Logical Q Places (Integer): This operation cannot
be performed in mixed-precision mode. The precision
of the result is the same as the precision of the input
operand P. A two's-complement shift length in the range
-64 to +63 (double-precision) or -32 to +31 (single-precision)
is extracted from the LSBs of the operand Q. The
operand P is logically right-shifted by the number of
places specified by the shift length. A negative shift
length therefore produces a left-shift. If a right-shift is
performed, 0s fill vacated bit positions to the left of the
input operand. If a left-shift is performed, 0s fill vacated
bit positions to the right of the input operand.

Shift P Arithmetic Q Places (Integer): This operation 
cannot be performed in mixed-precision mode. The pre- 
cision of the result is the same as the precision of the in- 
put operand P. A two's-complement shift length in the 
range -64 to +63 (double-precision) or -32 to +31 (sin- 
gle-precision) is extracted from the LSBs of the operand 
Q. The operand P is arithmetically right-shifted by the 
number of places specified by the shift length. A negative
shift length therefore produces a left-shift. If a right-shift
is performed, the MSB (bit 63 or 31) is replicated to
fill vacated bit positions to the left of the input operand. If
a left-shift is performed, 0s fill vacated bit positions to the
right of the input operand.
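
The two shift operations can be modeled as follows. This is a sketch under stated assumptions: the shift-length field is taken to be 7 bits for double precision and 6 bits for single precision, which is inferred from the stated ranges rather than given explicitly, and the function names are illustrative.

```python
def shift_length(q: int, width: int) -> int:
    """Decode the two's-complement shift length from the LSBs of Q.

    Field widths (7 bits for 64-bit, 6 bits for 32-bit) are inferred
    from the ranges -64..+63 and -32..+31.
    """
    bits = 7 if width == 64 else 6
    raw = q & ((1 << bits) - 1)
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw

def shift_logical(p: int, q: int, width: int = 32) -> int:
    """Logical shift: positive length shifts right, negative shifts left."""
    mask = (1 << width) - 1
    n = shift_length(q, width)
    p &= mask
    return (p >> n) if n >= 0 else (p << -n) & mask

def shift_arithmetic(p: int, q: int, width: int = 32) -> int:
    """Arithmetic shift: right shifts replicate the MSB, left shifts fill 0s."""
    mask = (1 << width) - 1
    n = shift_length(q, width)
    p &= mask
    if n < 0:
        return (p << -n) & mask
    signed = p - (1 << width) if p >> (width - 1) else p
    return (signed >> n) & mask
```

For the 32-bit operand 0x80000000 and a shift length of 4, the logical shift gives 0x08000000 while the arithmetic shift gives 0xF8000000.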

Funnel Shift PT Q Places (Integer): This operation
cannot be performed in mixed-precision mode. The operand
T is interpreted as having the same precision as
the input operand P, and the precision of the result is 
also the same as the precision of the input operand P. A 
two's-complement shift length in the range -64 to +63 
(double-precision) or -32 to +31 (single-precision) is 
extracted from the LSBs of the operand Q. A triple-width 
operand (96-bit or 192-bit) is formed by concatenating 
the input operands into the arrangement P-T-P, with the 
32-bit or 64-bit result field initially aligned with the T-op- 
erand. The triple-width operand is logically right-shifted 
by the number of places specified by the shift length. A 
negative shift length therefore produces a left-shift. 
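
The P-T-P arrangement can be sketched directly with wide integers. In this illustration the shift length n is assumed to be already decoded as a signed value, and the function name is hypothetical.

```python
def funnel_shift(p: int, t: int, n: int, width: int = 32) -> int:
    """Funnel shift over the triple-width operand P-T-P.

    The result field starts aligned with T; a positive n shifts the
    triple-width operand right, a negative n shifts it left.
    """
    mask = (1 << width) - 1
    p &= mask
    t &= mask
    triple = (p << (2 * width)) | (t << width) | p
    shifted = (triple >> n) if n >= 0 else (triple << -n)
    return (shifted >> width) & mask   # read back the field originally holding T
```

With P = 0x12345678 and T = 0x9ABCDEF0, a shift of 8 returns 0x789ABCDE and a shift of -8 returns 0xBCDEF012. When the same value is supplied for both P and T, the operation behaves as a rotate.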

Move P (Floating-Point or Integer): The 64-bit 
operand P is passed unchanged through the ALU. No 
exceptions are detected or signaled.

Primary and Alternate Floating-Point Formats

Two mode register fields, PFF and AFF, specify the pri- 
mary and alternate floating-point formats used by the 
ALU. All floating-point operations except format conver- 
sions are performed in the format specified by PFF. For 
format conversion operations, either the primary floating-point
format PFF or the alternate floating-point format AFF
is used as follows:

■ For conversions between floating-point and integer 
formats (base operation codes Convert T to integer,
Convert T to floating-point, Scale T to integer by Q, 
Scale T to floating-point by Q), the floating-point 
source or destination format is specified by PFF; for 
the scale operations, the format of operand Q is also 
specified by PFF. 

■ When converting from the primary floating-point 
format to the alternate floating-point format (base 
operation code Convert T to alternate F.P. format),
an operand in format PFF is converted to format 
AFF. 

■ When converting from the alternate floating-point 
format to the primary floating-point format (base
operation code Convert T to primary F.P. format), 
an operand in format AFF is converted to format 
PFF. 

Operation Precision 

The ALU performs all operations in double-precision 
format. All single-precision input operands are con- 
verted to double-precision equivalents by the ALU at 
the start of an operation. If the operation is to report a 
single-precision result, the ALU converts the double- 
precision internal result to single-precision at the end of 
the operation. 

Note that operation flags and exception bits pertain to 
the source and destination precisions. If, for example, 
an operation produces a single-precision overflowed re- 
sult, an overflow is indicated regardless of whether that 
result overflows the double-precision internal format. 

Operation Flags 

For each operation, the ALU produces thirteen flags. Of
these, a maximum of seven are relevant to any given operation.
The relevant flags are placed in the flag register
in the manner shown in Table 11. All flags are active
High. In flow-through mode the flag register is made
transparent, and the selected flags are presented di- 
rectly to the output multiplexer. 

The ALU flags are: 

C— CARRY: Carry-out bit produced by integer addition, 
subtraction, or comparison. 

I— INVALID OPERATION: Indicates that the input 
operands are unsuitable for the operation performed 
(e.g., ∞ x 0).

R— RESERVED OPERAND: Indicates that the opera- 
tion result is a reserved operand. Reserved operands in- 
clude signaling or quiet NaNs in IEEE format, and DEC 
reserved operands in DEC D or G formats. 

S — SIGN: Result sign; Low for a non-negative result. 
High for a negative result. 

U— UNDERFLOW: Indicates that the operation result 
underflowed the destination format. 

V — OVERFLOW: Indicates that the operation result 
overflowed the destination format. 

W— WINNER: Indicates which of two input operands is 
reported as the result of the MAX P, T and MIN P, T op- 
erations. A logic High indicates that operand T is re- 
ported as the result, a logic Low operand P. 

X— INEXACT RESULT: Indicates that the operation re- 
sult had to be rounded to fit the destination format. 

Z— ZERO RESULT: Indicates that the operation pro- 
duced a zero result. Note that the result is exactly zero 
only if the Z flag is High and the X flag is Low. 

>, =, <, #— GREATER THAN, EQUAL TO, LESS 
THAN, UNORDERED: Used to report the result of an 
operation with the Compare P, T base operation code. 
The Greater Than flag indicates that P > T, the Equal To 
flag that P = T, and the Less Than flag that P < T. The 
Unordered flag indicates that one or both input oper- 
ands are reserved operands and cannot be compared. 
Note that the Unordered flag cannot arise when compar- 
ing IBM floating-point operands or integers. Exactly 
one comparison flag will be active per comparison 
operation. 
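
The comparison flags can be modeled as a one-hot set. In this sketch an IEEE NaN stands in for a reserved operand; the function name and the dictionary keys are illustrative assumptions.

```python
import math

def compare_flags(p: float, t: float) -> dict:
    """Return the four comparison flags for Compare P, T.

    A NaN input is treated as a reserved operand, which makes the
    comparison unordered. Exactly one flag is True.
    """
    unordered = math.isnan(p) or math.isnan(t)
    return {
        ">": (not unordered) and p > t,
        "=": (not unordered) and p == t,
        "<": (not unordered) and p < t,
        "#": unordered,
    }
```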

Table 11. Organization of Flags

Note: Unused flags assume the Low state.

Updating the Status Register

The status register exception bits are updated at the
conclusion of each operation in flow-through mode, and 
at the start of each operation in pipeline mode. An ex- 
ception bit is updated only if the operation reports that 
exception with a flag. For example, an IEEE floating- 
point addition operation produces an overflow flag and 
would therefore update the overflow exception bit; an 
IEEE floating-point comparison operation, on the other 
hand, does not produce an overflow flag and would 
therefore leave the overflow exception bit unchanged. 

The mode register exception mask bits do not affect the
updating of the status register exception bits; masked
exceptions still appear in the status register. However,
a masked exception will not set the exception status
bit (ES).
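
The update rules above can be summarized in a small model. This is a sketch, not the device logic: the single-letter names follow the flag list (I, R, V, U, X, Z), the dictionary layout is an assumption, and ES is assumed to stay set until it is explicitly cleared.

```python
EXCEPTIONS = ("I", "R", "V", "U", "X", "Z")

def update_status(status: dict, reported: dict, masked: set) -> dict:
    """Return a new status after one operation.

    status   : current exception bits plus "ES"
    reported : flags the operation actually reports (name -> bool)
    masked   : names of exceptions masked in the mode register
    """
    new = dict(status)
    for name, value in reported.items():
        new[name] = value                    # only reported exceptions are updated
        if value and name not in masked:
            new["ES"] = True                 # only unmasked exceptions set ES
    return new

clear = {name: False for name in EXCEPTIONS + ("ES",)}
after_add = update_status(clear, {"V": True, "X": True}, masked={"X"})
after_compare = update_status(after_add, {"Z": False}, masked={"X"})
```

In this model the comparison does not report V, so the overflow bit left by the addition is unchanged.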

Operation Sequencing 

The Am29027 can be configured for either pipelined
or flow-through (unpipelined) operation. Flow-through
mode is normally selected for performing scalar operations;
pipeline mode provides high throughput for vector
operations. The manner in which operations are sequenced
depends on the mode currently invoked.

Operation in Flow-Through Mode 

Flow-through mode is invoked by setting mode register 
bit PL (Pipeline Mode Select) to logic Low.

Programmer's Model 

A programmer's model of the Am29027 in flow-through 
mode is shown in Figure 9. Note that Output Register F 
and the flag register are made transparent in this mode. 

Performing Operations 

Flow-through mode operations are performed by:

■ Storing instructions and/or operands in the
Am29027, and starting the operation

■ Loading the operation result

Figure 15. Programmer's Model for Flow-Through Mode

Storing instructions and operands can be done in any of
three ways:

■ Writing the instruction only, and starting the
operation: This is appropriate when all necessary
operands are already present in the Am29027, 
as is sometimes the case when using on-board 
constants or the results of previous operations 
stored in the register file. 

■ Writing the operands only, and starting the 
operation: This is appropriate when the desired 
instruction is already present in the Am29027, as is 
the case when performing the second of two 
identical operations. 

■ Writing the instruction and operands, and 
starting the operation: This is appropriate 
whenever the next operation requires both a new 
instruction and new operands. 

Operands and instructions are written using the write 
operand R, write operand S, write operands R, S, and 
write instruction transaction requests. Operands and 
instructions can be written to the Am29027 in any order, 
with the operation start bit (DREQT0 High) accompanying
the last of the transaction requests.

Loading an operation result is performed using the read 
result MSBs, read result LSBs, and read flags trans- 
action requests. The specific request used depends on 
whether the result of an operation is a flag or flags (as is 
the case with comparison operations) or data (as is the 
case with most other operations). In cases where the 
operation result is stored in the register file, the user 
may elect not to read the result but to proceed with the 
next operation. 

Operation Timing 

The Am29027 will usually start a flow-through operation 
during the first cycle following the receipt of a write 
operand R, write operand S, write operands R, S, or 
write instruction transaction request having signal 
DREQT0 set High.

Operation execution begins with the transfer of the contents
of the R-Temp, S-Temp, and I-Temp registers to
Register R, Register S, and the instruction register,
respectively; only those temporary registers written to as
part of the operation specification will be transferred.
The operand or instruction accompanying the transaction
request that starts the operation (that is, the transaction
request for which signal DREQT0 is High) is written
directly to the appropriate working register, that is,
Register R, Register S, or the instruction register.

Once started, an operation will proceed for the number 
of cycles specified by mode register fields MATC, 
MVTC, and PLTC; MATC specifies the number of cycles 
for base operation code (P x Q) + T, MVTC the number 
of cycles for base operation code MOVE P, and PLTC 
the number of cycles for all other base operation codes. 
At the end of the last operation cycle, the status register
exception bits and exception status bit will be updated
and, optionally, the operation result will be written to the
register file and precision register. 

There are two conditions for which the Am29027 will not
start an operation immediately. The first condition is
when an operation is already in progress. In this case
the new operation is kept pending in the I-Temp,
R-Temp, and S-Temp registers until the current operation
is completed, at which time the new operation begins.
The second condition is when a previous operation
creates an unmasked exception in Halt On Error mode
(mode register bit HE High). In this case the new operation
is kept in the I-Temp, R-Temp, and S-Temp registers
until the exception is cleared, at which time the new
operation begins.

Timing for typical accelerator operations in the flow- 
through mode is illustrated in Appendix D. 

Availability of Operation Results 

In order to directly read the result of an operation, the 
operation specification should be followed by the appropriate
read transaction request. Should the Am29000
attempt to read an operation result before the operation
is completed, the Am29027 will withhold acknowledging
the transaction request by holding signals DRDY and
DERR inactive until the operation has been completed.
All read transaction requests, including save state, will 
be held off in this manner. 

Overlapping Operations 

Due to the presence of the R-Temp, S-Temp, and
I-Temp registers, it is possible to partially or completely
specify a new operation while the previously specified
operation is being performed. Execution of the new
operation will begin immediately after the previous operation
is completed. Execution begins with the transfer
of the contents of the R-Temp, S-Temp, and I-Temp registers
to the corresponding working registers; only those
temporary registers that have been written to as part of
the operation specification are transferred.

It is important to note that, once the new operation is 
completely specified, any attempt to read a result will be 
held off until the new operation is completed. This 
means that it is not possible to directly read the result of 
an operation if another operation is completely specified 
before the results of the first operation are read. If, for 
example, specification of operation 2.0 + 3.0 is immedi- 
ately followed by specification of operation 4.0 x 5.0, 
subsequent read result LSBs and read result MSBs 
transaction requests will return value 20.0, the result of 
the second operation. Similarly, a read flags transaction 
request will return flags for the second operation, and a 
read status transaction request will return status reflect- 
ing the completion of the second operation. This de- 
layed read feature is provided to eliminate ambiguity in 
the correspondence between operations and results. 
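
The delayed-read behavior can be pictured with a toy model. This is a sketch only: the class and method names are hypothetical, timing is ignored, and the list used here is unbounded, whereas the device holds at most one pending operation in its temporary registers.

```python
import operator

class FlowThroughReadModel:
    """Toy model: a read returns the result of the last fully specified operation."""

    def __init__(self):
        self._specified = []      # fully specified operations not yet completed
        self._result = None

    def specify(self, operation, *operands):
        """Record a completely specified operation."""
        self._specified.append((operation, operands))

    def read_result(self):
        """The read is held off until every specified operation has completed."""
        while self._specified:
            operation, operands = self._specified.pop(0)
            self._result = operation(*operands)
        return self._result

model = FlowThroughReadModel()
model.specify(operator.add, 2.0, 3.0)   # 2.0 + 3.0
model.specify(operator.mul, 4.0, 5.0)   # 4.0 x 5.0, specified before any read
value = model.read_result()
```

As in the example in the text, the read returns 20.0, and the result of the addition is no longer directly readable.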

Should two operations be overlapped, and should the 
first operation have as its target a register file location, 
the second operation can be completely specified before
the first operation is completed. If the first operation
produces a result that is to be read directly by the 
Am29000, the second operation can be partially speci- 
fied before the result of the first operation is read. A 
partial operation specification is one that includes all but 
the last operand or instruction. 

Timing for typical overlapped operations in flow-through 
mode is illustrated in Appendix D. 

Saving and Restoring State 

In flow-through mode, the complete state of the
Am29027 can be saved and restored with the save state
transaction request. The first save state transaction
request will return the contents of the instruction register;
subsequent requests will return the contents of
Registers I-Temp, R, S, R-Temp, S-Temp, the status
register, the precision register, register file locations
RF7-RF0, and the mode register. The user has the option
of saving only part of the state by issuing only the
number of save state transaction requests needed
to save registers of interest. When issuing a series of
save state transaction requests, data is returned in the
following order:

Request     Data Returned

1           Instruction
2           I-Temp
3           R LSBs
4           R MSBs
5           S LSBs
6           S MSBs
7           R-Temp LSBs
8           R-Temp MSBs
9           S-Temp LSBs
10          S-Temp MSBs
11          Status
12          Precision
13          RF0 LSBs
14          RF0 MSBs
.           .
.           .
27          RF7 LSBs
28          RF7 MSBs
29          Mode LSBs
30          Mode MSBs

Sequencing for the save state transaction request is 
reinitialized when the Am29000 issues any transaction 
request other than save state. If, for example, the 
Am29000 issues a write operand R transaction request 
after a series of save state requests, the next save state 
request will return the contents of the instruction 
register. 
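
The sequencing rule can be sketched as a small state machine. The names below are hypothetical; the placement of RF1 through RF6 in requests 15 through 26 is inferred from the positions of RF0 and RF7, and behavior after the thirtieth request is not described here, so the sketch raises an error in that case.

```python
def save_state_order() -> list:
    """Build the list of items returned by successive save state requests."""
    order = ["Instruction", "I-Temp",
             "R LSBs", "R MSBs", "S LSBs", "S MSBs",
             "R-Temp LSBs", "R-Temp MSBs", "S-Temp LSBs", "S-Temp MSBs",
             "Status", "Precision"]
    for i in range(8):                       # RF0 .. RF7 (RF1-RF6 inferred)
        order += [f"RF{i} LSBs", f"RF{i} MSBs"]
    order += ["Mode LSBs", "Mode MSBs"]
    return order

ORDER = save_state_order()

class SaveStateSequencer:
    """Tracks which item the next save state request returns."""

    def __init__(self):
        self.index = 0

    def request(self, kind: str):
        if kind != "save state":
            self.index = 0                   # any other request reinitializes
            return None
        if self.index >= len(ORDER):
            raise IndexError("behavior past the last item is not modeled")
        item = ORDER[self.index]
        self.index += 1
        return item
```

For example, three save state requests return Instruction, I-Temp, and R LSBs; a write operand R request then resets the sequence, and the next save state request returns Instruction again.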

It should be noted that the process of saving state alters 
the contents of the instruction register and Registers R 
and S.

Error reporting via signal DERR is suppressed for the 
save state transaction request. 

Accelerator state is restored using transaction requests 
in concert with the MOVE P base operation code. Before 
restoring state, all status register bits should be set to 
logic Low using the write status transaction request to 
prevent the possibility of an unmasked exception bit 
inhibiting the restore sequence. The accelerator oper- 
and and instruction registers can then be restored, 
followed by restoration of the status register using the 
write status transaction request, with signal DREQT0 asserted
to indicate the end of the restore sequence.
When state restoration is complete, the Am29027 will
retime the operation specified by current instruction
register contents.

Accelerator state is restored in the following order:

Register to     Procedure for restoring
be restored

Status          Set all bits in the status register to a logic
                Low using the write status transaction
                request.

Mode            Write using write mode transaction
                request.

RF0             Write "Move R to RF0" instruction using
                write instruction transaction request.

                Write RF0 value to Register R using write
                operand R transaction request, start
                operation.

.               .
.               .

RF7             Write "Move R to RF7" instruction using
                write instruction transaction request.

                Write RF7 value to Register R using
                write operand R transaction request, start
                operation.

Precision       Guarantee that "Move R to RF7" operation
                has been completed by performing a read
                result MSBs transaction request.

                Write precisions using write register file
                precisions transaction request.

R, S,           Write R value to Register R-Temp
Instruction     using the write operand R transaction
                request.

                Write S value to Register S-Temp using the
                write operand S transaction request.

                Write instruction value to Register I-Temp
                using write instruction transaction request.

                Transfer contents of Registers R-Temp,
                S-Temp, and I-Temp to Register R, Register
                S, and the instruction register, respectively,
                using the advance temp registers
                transaction request.

R-Temp,         Write R-Temp value to Register R-Temp
S-Temp,         using the write operand R transaction
I-Temp          request.

                Write S-Temp value to Register S-Temp
                using the write operand S transaction
                request.

                Write I-Temp value to Register I-Temp
                using the write instruction transaction
                request.

Status          Write status to status register using the
                write status transaction request, with signal
                DREQT0 asserted to indicate that the
                restore sequence is complete.

The user may elect to restore only those registers relevant
to a particular application by omitting parts of the
state restoration sequence. The only mandatory portions
of state restoration are the initial clearing of the
status register, and restoration of the status register with
signal DREQT0 asserted to indicate completion of the
restore sequence.

Error Recovery

Six exception bits (invalid operation, reserved operand,
overflow, underflow, inexact result, and zero result)
are maintained in the status register; these bits
are updated upon completion of an operation. Exception
bits can be masked individually by programming the
appropriate bits in the mode register; if the corresponding
mask bit is inactive (logic Low), the exception bit is said
to be unmasked and contributes to error reporting. The
Am29027 provides three mechanisms with which
unmasked exceptions can be handled.



Reporting Errors Upon Read 

If an unmasked status register exception bit is set, the
Am29027 will signal an error by asserting signal DERR
when the Am29000 performs a read result LSBs, read
result MSBs, read flags, or read status transaction request.
Error reporting can be suppressed by issuing any
of these transaction requests with signal DREQT0
asserted.

Halt On Error Mode 

Should the application require, the Am29027 can be 
configured to halt operation upon detection of an un- 
masked exception; this mode is invoked by setting 
mode register bit HE (Halt On Error) High. Once config- 
ured this way, the Am29027 will respond to an un- 
masked exception as follows: 

■ Signal CDA will become inactive upon completion 
of the operation producing the unmasked 
exception. 

■ Should the operation producing the unmasked 
exception specify that the operation result be stored 
on-chip, that is, in the register file, the result will not 
be written to its destination. 

■ A pending operation will not be started; the 
operands and/or instruction for that operation will 
remain in the appropriate temporary registers. 

■ If the Am29000 attempts to start a new operation 
during the last cycle of the operation that produces 
the unmasked exception by issuing a write operand 
R, write operand S, write operands R, S, or write 
instruction transaction request with DREQT0
asserted, and if no other operation is pending, the 
operand or instruction will be written to the 
appropriate temporary register rather than to the R, 
S, or instruction register. 

■ Once CDA is deasserted, the Am29027 will respond 
to the write operand R, write operand S, write 
operands R, S, and write instruction transaction
requests by asserting signal DERR one cycle after 
the request is issued; the contents of the target 
register or registers will remain unchanged.

Through these measures, the Am29027 will retain the
input operands and instructions for the operation causing
the exception. The input operands will be retained in
the R register, S register, or register file locations,
and the instructions will be retained in the instruction
register. Additionally, the R-Temp, S-Temp, and I-Temp
registers may contain the operands and instructions 
for a partially or fully specified pending operation. The 
Am29000 can recover these operands and instructions 
with the save state transaction request; this infor- 
mation can then be given to an error-handling routine for 
resolution. 

The error halt condition is removed by clearing the 
status register exception status (ES) bit and the excep- 
tion bit or bits responsible for producing the halt. 



Reporting Errors via EXCP

Signal EXCP will go active Low in the presence of an un- 
masked exception. This signal can be connected to an 
Am29000 trap or exception input signal, and is enabled 
or disabled independent of other exception handling 
mechanisms with mode register bit EX. 

Writing to the Mode, Status, and
Precision Registers 

Unlike the R, S, and instruction registers, the mode, 
status, and precision registers are not preceded by tem- 
porary registers. Accordingly, writing to these registers 
may produce undesirable or unpredictable side effects if 
an accelerator operation is in progress at the time. To 
avoid such side effects, a write to any of these registers 
should be preceded by a read transaction request, 
which will guarantee that any current or pending accel- 
erator operations will have been completed before the 
write transaction request is issued. 

Writing to the Register File

The numerical result of any operation may be written to 
the register file by specifying the desired destination in 
instruction field RFS and setting instruction bit RF High. 
The result can then be used as an input operand for sub- 
sequent operations. 

It is permissible for an operation result to be placed in a 
register file location that previously contained an input 
operand for that operation. In such a case, however, it is 
not permissible for the Am29000 to directly read the re- 
sult, status, or flags for that operation, as the writing of 
the result modifies the operation performed by the ALU. 

Determining Timer Counts 

To provide optimum accelerator performance over a 
range of possible system clock frequencies, the timing 
of Am29027 operations is programmable. Three mode
register fields, pipeline timer count (PLTC), timer count
for the Multiply-Accumulate Operation (MATC), and
timer count for the MOVE P Operation (MVTC), must
be programmed according to system clock frequency
and accelerator speed.



PLTC 

PLTC specifies the number of cycles allotted to operations
other than those using base operation codes
(P x Q) + T or MOVE P. This count can assume values
between 3 and 15, inclusive, and must be given a value
that satisfies the relationship:

[8] < PLTC x [1],

where [8] = Operation time, flow-through
            mode, all other base operation
            codes
and   [1] = CLK period,

as described in the Switching Characteristics table.

MATC 

MATC specifies the number of cycles allotted to operations
that use base operation code F' = (P' x Q') + T'.
This count can assume values between 3 and 15, inclusive,
and must be given a value that satisfies the
relationship:

[6] < MATC x [1],

where [6] = Operation time, flow-through
            mode, F' = (P' x Q') + T'
and   [1] = CLK period,

as described in the Switching Characteristics table.

MVTC 

MVTC specifies the number of cycles allotted to operations
that use the MOVE P base operation code. This
count can assume values between 3 and 15, inclusive,
and must be given a value that satisfies the relationship:

[7] < MVTC x [1],

where [7] = Operation time, flow-through
            mode, MOVE P
and   [1] = CLK period,

as described in the Switching Characteristics table.
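
All three timer counts follow the same relationship, so the smallest legal value can be found with one helper. The sketch below is illustrative; the function and parameter names are assumptions, and the operation time and CLK period must be taken from the Switching Characteristics table for the actual part and clock.

```python
def min_timer_count(operation_time_ns: float, clk_period_ns: float) -> int:
    """Smallest count in 3..15 with operation_time < count x clk_period."""
    for count in range(3, 16):
        if operation_time_ns < count * clk_period_ns:
            return count
    raise ValueError("no timer count in the range 3 to 15 satisfies the relationship")
```

For instance, an operation time of 240 ns with a 45-ns clock period gives a minimum count of 6, since 5 x 45 ns = 225 ns is too short and 6 x 45 ns = 270 ns is sufficient.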



Advancing DRDY

Normally, an operation result produced by the Am29027 
in flow-through mode is read by the Am29000 no sooner 
than the clock cycle following operation completion. De- 
pending on the system clock frequency used, it may be 
advantageous to overlap the reading of the result with 
the last cycle of the operation. Consider, for example, a 
system with a 45-ns clock cycle and an Am29027 that 
performs an operation in 240 ns. The pipeline timer 
count PLTC will have to be set to a minimum of 6 for 
such a system, and the Am29000 will read a result 
no sooner than during the seventh clock cycle after the 
start of an operation. 



Mode register bit DA, DRDY Advance, can be used to
advance transaction status signals DRDY and DERR by
a full clock cycle, thus allowing the Am29000 to read
data one clock cycle earlier than would otherwise be
possible. For the example given above PLTC remains at
6, but the Am29000 can read data during the sixth clock
cycle after the operation starts rather than the seventh,
thus saving a clock cycle.



In order to advance DRDY and DERR, the following system
timing conditions must be met:

[19] < (MATC x [1]) - [x9B] - [gate]
[20] < (MVTC x [1]) - [x9B] - [gate]
[21] < (PLTC x [1]) - [x9B] - [gate]

where [19] = Data operation-start-to-output
             valid delay, F' = (P' x Q') + T'
      [20] = Data operation-start-to-output
             valid delay, MOVE P
      [21] = Data operation-start-to-output
             valid delay, all other operations
and   [1]  = CLK period

as described in the Switching Characteristics table
and

      [x9] = Synchronous input setup time

as described in the Switching Characteristics table of
the Am29000 Preliminary Data Sheet (order #09075).

The term [gate] represents the delay of the external
gate through which the DERR signal passes.

Timing for a typical accelerator operation with DRDY 
advanced is illustrated in Appendix D. 
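
The timing conditions for advancing DRDY share one form and can be checked with a single helper. This is a sketch with hypothetical names; every input must come from the relevant Switching Characteristics tables and the actual gate used, and the numbers in the usage lines are placeholders, not data-sheet values.

```python
def can_advance_drdy(output_valid_delay_ns: float,
                     timer_count: int,
                     clk_period_ns: float,
                     setup_time_ns: float,
                     gate_delay_ns: float) -> bool:
    """True when delay < (timer_count x CLK period) - setup time - gate delay."""
    budget = timer_count * clk_period_ns - setup_time_ns - gate_delay_ns
    return output_valid_delay_ns < budget

# Placeholder figures for illustration only.
ok = can_advance_drdy(200.0, 6, 45.0, 10.0, 5.0)        # 200 < 270 - 10 - 5
too_slow = can_advance_drdy(260.0, 6, 45.0, 10.0, 5.0)  # 260 is not < 255
```

The check must hold for each class of operation (multiply-accumulate, MOVE P, and all others) with its own delay and timer count.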

Operation In Pipeline Mode 

Pipeline mode is invoked by setting mode register bit PL
(Pipeline Mode Select) to logic High.

Programmer's Model 

A programmer's model of the Am29027 in pipeline 
mode is shown in Figure 10. Note that Output Register F
and the flag register are non-transparent in this mode, 
thus permitting the overlap of the current operation(s) 
with the reading of the result for a previous operation. 

Pipeline Delays 

When placed in pipeline mode, the ALU is divided into
three pipeline stages for multiply-accumulate operations,
and into two stages for all other operations. The
ALU configuration for pipeline mode is shown in
Figure 11. Note that for multiply-accumulate operations,
multiplicand P and multiplier Q enter the first
pipeline stage, while addend T enters the second pipeline
stage. As a consequence, the source for operands
P and Q must be specified in the corresponding
multiply-accumulate instruction, while the source for
operand T must be specified in the following instruction.

Pipeline Advance 

The ALU pipeline is advanced whenever a new operation
begins. One consequence of this advance criterion
is that data does not fall through the pipe but instead is
"pushed" through. If, for example, an addition is performed
in pipeline mode, the pipe must be advanced
twice (by starting two operations) before the result of the
addition appears in Register F, the flag register, the
status register, and, optionally, a register file location.
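
The "push" behavior can be illustrated with a small behavioral model. This is only a sketch in Python, not a description of the device's internal logic: the class name, the method name, and the use of operation labels in place of numerical results are assumptions made for illustration.

```python
class PushPipeline:
    """Toy model of a pipeline that advances only when an operation starts."""

    def __init__(self, stages=2):
        # None marks a stage that holds no operation of interest.
        self.stages = [None] * stages
        self.register_f = None

    def start(self, op):
        """Start an operation: every stage moves forward one position.

        The output of the last stage lands in Register F, and the new
        operation enters the first stage.
        """
        self.register_f = self.stages[-1]
        self.stages = [op] + self.stages[:-1]
        return self.register_f


pipe = PushPipeline(stages=2)
pipe.start("ADD1")        # ADD1 enters stage 1
pipe.start("OP2")         # first advance: ADD1 moves to stage 2
print(pipe.start("OP3"))  # second advance: ADD1 reaches Register F
```

Nothing moves in this model unless `start` is called, which is why a trailing operation is needed to flush the last result of interest.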

Performing Operations 

Pipeline mode operations are performed by:

■ Storing instructions and/or operands in the 
Am29027, and starting the operation 

■ Loading the result of a previous operation 

Storing instructions and operands can be done in any of 
three ways: 

■ Writing the instructions only, and starting the 
operation: This is appropriate when all necessary 
operands are already present in the Am29027, 
as is sometimes the case when using on-board
constants or the results of previous operations 
stored in the register file. 

■ Writing the operands only, and starting the
operation: This is appropriate when the desired
instructions are already present in the Am29027, as
is the case when performing the second of two
identical operations.

■ Writing the instructions and operands, and
starting the operation: This is appropriate
whenever the next operation requires both new
instructions and new operands.

Operands and instructions are written using the write
operand R, write operand S, write operands R, S, and
write instruction transaction requests. Operands and
instructions can be written to the Am29027 in any order,
with the operation start bit (DREQT0 High) accompanying
the last of the transaction requests.

Loading the result of a previous operation is performed
using the read result MSBs, read result LSBs, and read
flags transaction requests. The specific request used
depends on whether the result is a flag or flags (as is the
case with comparison operations) or data (as is the case
with most other operations). In cases where the
operation result is stored in the register file, the user
may elect not to read the result, but to proceed with the
next operation.

Operation Timing 

The Am29027 will usually start a pipelined operation
during the first cycle following the receipt of a write
operand R, write operand S, write operands R, S, or write
instruction transaction request having signal DREQT0
set High.

Operation execution begins with the transfer of the
contents of the R-Temp, S-Temp, and I-Temp registers to
Register R, Register S, and the instruction register,
respectively; data is transferred only from those temporary
registers written to as part of the operation
specification. The operand or instruction accompanying the
transaction request that starts the operation (that is, the
transaction request for which signal DREQT0 is High) is
written directly to the appropriate working register, that
is, Register R, Register S, or the instruction register. At
the start of the operation, the output of the last ALU
pipeline stage is transferred to Register F, the flag register,
and, optionally, to a register file location; the status
register exception status and exception bits are
updated. The outputs of all other ALU pipeline stages
are written to their respective pipeline registers.

Figure 16. Programmer's Model for Pipeline Mode

Once started, an operation will proceed for the number 
of cycles specified by mode register field PLTC, which 
denotes the number of cycles needed for data to tra- 
verse a single pipeline stage. 



There are two conditions for which the Am29027 will not
start an operation immediately. The first condition is
when an operation has been started recently and has
not yet had time to settle at the output of the first pipeline
stage. In this case the new operation is kept pending in
the I-Temp, R-Temp, and S-Temp registers until the
previous operation completes the first pipeline stage.
The second condition is when a previous operation
creates an unmasked exception in Halt On Error mode
(mode register bit HE High). In this case the new operation
is kept in the I-Temp, R-Temp, and S-Temp registers
until the exception is cleared, at which time the new
operation will begin.

Figure 17. ALU Configuration for Pipeline Mode

Timing for typical accelerator operations in the pipeline
mode is illustrated in Appendix D.

Availability of Operation Results

Because Register F, the flag register, and the status 
register are updated at the beginning of an operation, 
these registers can be read at any time after an opera- 
tion begins. 

Overlapping Operations 

Due to the presence of the R-Temp, S-Temp, and
I-Temp registers, it is possible to partially or completely
specify a new operation while the previously specified
operation is propagating through the first ALU pipeline
stage. Execution of the new operation will begin
immediately after the previous operation completes the first
pipeline stage. Execution begins with the transfer of the
contents of the R-Temp, S-Temp, and I-Temp registers
to the corresponding working registers; only those
temporary registers that have been written to as part of
operation specification are transferred.

It is important to note that, once the new operation is
completely specified, any attempt to read a result will be
held off until the new operation begins; this means that it
is not possible to read the result that is placed in the
output registers when the first operation begins. If, for
example, result X is placed in Register F when an
operation starts and if another operation is completely
specified thereafter, subsequent read result MSBs and
read result LSBs transaction requests will return not X,
but the result placed in the F register when the second
operation begins; the read flags and read status
transaction requests will behave in like manner. This delayed
read feature is provided to eliminate ambiguity in the
correspondence between operations and results.

Saving and Restoring State 

Due to the presence of ALU pipeline registers, it is not 
possible to save the complete state of the Am29027 in 
pipeline mode. Pipeline operations may therefore be in- 
terrupted only under special circumstances, such as: 

■ If the interrupting routine does not use the 
floating-point accelerator 

or 

■ If the current series of pipelined operations has
been completed, and any operands needed for
future operations have already been transferred to
the Am29000

The save state transaction request is disabled in pipeline
mode. It is permissible to switch to flow-through
mode and use the save state transaction request, but
doing so does not permit the saving of Register F, the
flag register, or the ALU pipeline registers.

Error Recovery 

As for flow-through mode, the Am29027 provides three
mechanisms with which unmasked exceptions can be
handled.

Reporting Errors Upon Read 

If an unmasked status register exception bit is set, the
Am29027 will signal an error by asserting signal DERR
when the Am29000 performs a read result LSBs, read
result MSBs, read flags, or read status transaction
request. Error reporting can be suppressed by issuing any
of these transaction requests with signal DREQT0
asserted.

Halt On Error Mode 

Should the application require it, the Am29027 can be 
configured to halt operation upon detection of an un- 
masked exception; this mode is invoked by setting 
mode register bit HE (Halt On Error) High. Once config- 
ured this way, the Am29027 will respond to an un- 
masked exception as follows: 

■ Signal CDA will become inactive when the results of 
the operation producing the unmasked exception 
are transferred from the last pipeline stage to 
Register F, the flag register, and the status register. 

■ Once CDA is deasserted, the Am29027 will respond
to the write operand R, write operand S, write
operands R, S, and write instruction transaction
requests by asserting signal DERR one cycle after
the request is issued; the contents of the target
register or registers will remain unchanged.

Through these measures, the Am29027 will retain the
input operands and instructions for the most recently
started operation. The input operands for that operation
will be retained in the R register, S register, or register
file locations, and the instructions will be retained in the
instruction register. Additionally, the R-Temp, S-Temp,
and I-Temp registers may contain the operands and
instructions for a partially or fully specified pending
operation. Note that the input operands and instruction
words for the operation causing the exception, as well
as for operations currently in the ALU pipeline, will not
be available. At the user's option, this information can
be stored in a circular queue in the Am29000 register
file so that full recovery from a pipelined exception is
possible.
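
The circular queue mentioned above can be sketched as a fixed-depth log of recently started operations. The Python below is illustrative only: on a real system the queue would live in Am29000 registers, and the class name, the default depth, and the tuple layout are assumptions rather than anything specified for the device.

```python
from collections import deque


class OperationLog:
    """Fixed-depth record of the most recently started operations."""

    def __init__(self, depth=4):
        # depth is an assumed figure; it should cover every operation that
        # can still be in flight when an exception is reported.
        self._queue = deque(maxlen=depth)

    def record(self, instruction, operand_r, operand_s):
        """Log an operation at the moment it is started."""
        self._queue.append((instruction, operand_r, operand_s))

    def in_flight(self):
        """Return the logged operations, oldest first."""
        return list(self._queue)


log = OperationLog(depth=3)
for n in range(5):
    log.record(f"instr{n}", float(n), float(n + 1))
print(log.in_flight())  # only the three most recent entries remain
```

An error handler could replay the logged entries after the halt condition is cleared; the oldest entry is overwritten automatically once the queue is full.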

The Am29000 can read the contents of Am29027 oper- 
and and instruction registers by invoking flow-through 
mode and using the save state transaction request. 
Note that the contents of Register F, the flag register, 
and the ALU pipeline registers will be lost. This informa- 
tion can then be given to an error-handling routine for 
resolution. 



The error halt condition is removed by clearing the 
status register exception status (ES) bit and the excep- 
tion bit or bits responsible for producing the halt. 



Reporting Errors via EXCP 

Same as for the flow-through mode. 

Pipeline Invalidation 

There are several situations for which the ALU pipeline
stages may contain invalid data. The Am29027
recognizes these situations and invalidates results
automatically; results marked as invalid will not update the
status register, register file locations RF7-RF0, or the
precision register. Results are invalidated for the
following conditions:

■ The Am29027 is switched from flow-through mode 
to pipeline mode. Any data present in the ALU at the 
time of the switch is marked as invalid. This 
invalidation is illustrated in Figure 12a. 

■ The Am29027 performs a multiply-accumulate 
operation that is preceded by an operation other 
than multiply-accumulate. The multiply-accumulate 
operation result and the result that precedes it will 
be separated by a spurious result, due to the 
insertion of an additional pipeline stage for the 
multiply-accumulate operation. The spurious result 
is marked invalid. This invalidation is illustrated in 
Figure 12b. 

The pipeline may also be invalidated manually by issuing
a write status transaction request with signal
DREQT0 asserted High; this request invalidates all current
pipeline contents. Pipeline invalidation does not apply
to operation in flow-through mode.

Writing to the Mode, Status, and Precision 
Registers 

Unlike the R, S, and instruction registers, the mode, 
status, and precision registers are not preceded by tem- 
porary registers. Accordingly, writing to these registers 
may produce undesirable or unpredictable side effects if 
an accelerator operation is pending at the time. To avoid 
such side effects, a write to any of these registers should 
be preceded by a read transaction request, which will 
guarantee that any pending accelerator operation will 
have started before the write transaction request is 
issued. 

The mode register outputs are not pipelined in the ALU;
that is, all pipeline stages receive mode information
directly from the mode register. Accordingly, writing to
the mode register may produce undesirable or
unpredictable side effects for operations currently in the ALU
pipeline. To avoid such side effects, a write to the mode
register should be performed only if the contents of the
ALU pipeline are a "don't care," that is, only after the last
operation result of interest has been written to Register
F, the flag register, or a register file location. If, for
example, the last in a series of addition operations has
just been started, the mode register should not be written
until the pipeline is advanced twice, placing that
operation's results in the F register, flag register, and,
optionally, a register file location.

[Timing diagram: rows Operation, Pipeline Stage 1, Pipeline Stage 2, and Result for operations 1 and 2 started after the switch to pipeline mode, with the interval of invalid pipeline output marked.]

a. Pipeline invalidation timing for switch from flow-through to pipeline mode. Operations shown incur
two pipeline delays in pipeline mode [all base operations except F' = (P' × Q') + T'].

Operation        | ADD1 | MPY1 | MAC1 | MAC2 | MAC3 |(DMAC)| ADD2 | MPY2 | ADD3 | MPY3 | ADD4 | MPY4 |
Pipeline Stage 1 | ADD1 | MPY1 | MAC1 | MAC2 | MAC3 |(DMAC)| ADD2 | MPY2 | ADD3 | MPY3 | ADD4 | MPY4 |
Pipeline Stage 2 |      | ADD1 | MPY1 | MAC1 | MAC2 | MAC3 |(DMAC)| ADD2 | MPY2 | ADD3 | MPY3 | ADD4 |
Pipeline Stage 3 |      |      |      |  ?   | MAC1 | MAC2 | MAC3 |      |      |      |      |      |
Result           |      |      | ADD1 | MPY1 |  ?   | MAC1 | MAC2 | MAC3 | ADD2 | MPY2 | ADD3 | MPY3 |

The result marked ? is the invalid pipeline output.

b. Pipeline invalidation timing for multiply-accumulate operations in pipeline mode.

Notes: ADDx = addition operation
       MPYx = multiplication operation
       MACx = multiply-accumulate operation
       (DMAC) = dummy multiply-accumulate operation

Figure 18. Pipeline Invalidation Timing

Writing to the Register File 

The numerical result of any operation may be written to
the register file by specifying the desired destination in
instruction field RFS and setting instruction bit RF High.
The result may then be used as an input operand in
subsequent operations. Because all ALU operations incur
one or more pipeline delays, the result of an operation
will not be available for use by the very next operation.

It is permissible for an operation result to be placed in a
register file location that previously contained an input
operand for that operation.

Multiplication-Accumulation Operations

The pipeline structure of the Am29027 permits the
evaluation of sum-of-products expressions in a
canonically efficient manner by interleaving the evaluation of
two sum-of-products expressions. Operation sequencing
is described in Figure 13.
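
The interleaving shown in Figure 13 pairs two rows of the matrix and alternates between them, so that consecutive multiply-accumulate operations never belong to the same running sum. The sketch below reproduces that issue order in Python and uses it to form a matrix-vector product; the function names are hypothetical, and the model ignores pipeline latency and the extra MAC operation needed to terminate the sequence.

```python
def interleaved_mac_order(n_rows=4, n_cols=4):
    """Return (row, col) index pairs in issue order, two rows at a time.

    For a 4 x 4 matrix the order is a11, a21, a12, a22, ..., a14, a24,
    followed by a31, a41, a32, a42, ..., a34, a44.
    """
    order = []
    for top in range(0, n_rows, 2):
        for col in range(n_cols):
            for row in (top, top + 1):
                if row < n_rows:
                    order.append((row, col))
    return order


def matvec_interleaved(a, b):
    """Accumulate c = A x b, issuing products in the interleaved order."""
    c = [0.0] * len(a)
    for row, col in interleaved_mac_order(len(a), len(b)):
        c[row] += a[row][col] * b[col]
    return c


# One-based labels of the first four products issued.
print([f"a{r + 1}{c + 1}" for r, c in interleaved_mac_order()[:4]])
```

Alternating between two sums gives each partial sum time to pass through the accumulate stage before its next product arrives, which is the point of pairing the expressions.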

Determining Timer Counts 

As for flow-through mode, the timing of operations in 
pipeline mode is programmable to accommodate 
variations in system timing. A single mode register 
field— pipeline timer count (PLTC) — specifies the timing 
of all pipelined operations; fields MATC and MVTC are 
not used. 

PLTC specifies the number of cycles allotted for data to 
traverse a single pipeline stage. This count can assume 
values between 2 and 15, inclusive, and must be given a 
value that satisfies the relationship: 

    [9] < PLTC × [1],

where [9] = Operation time, pipeline
            mode, all operations
and   [1] = CLK period,

as described in the Switching Characteristics table. 
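
The smallest PLTC value that satisfies this relationship follows directly from the strict inequality and the permitted range of 2 to 15. The Python helper below is a sketch under the assumption that both times are given in nanoseconds; the function name is hypothetical.

```python
def min_pltc(op_time_ns, clk_ns):
    """Smallest PLTC with op_time < PLTC x CLK period, limited to 2..15."""
    # Floor division plus one gives the smallest integer count whose
    # product with the clock period strictly exceeds the operation time.
    count = int(op_time_ns // clk_ns) + 1
    count = max(count, 2)
    if count > 15:
        raise ValueError("operation time too long for this clock period")
    return count


print(min_pltc(150, 50))  # 150-ns operation time with a 50-ns clock
print(min_pltc(240, 45))  # 240-ns operation with a 45-ns clock
```

Because the inequality is strict, an operation time that is an exact multiple of the clock period needs one additional cycle.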



Advancing DRDY 

Because the Am29027 F register and flag register are 
non-transparent in pipeline mode, it is not possible (nor 
advantageous) to advance DRDY. Accordingly, mode 
register bit M44 has no effect in pipeline mode. 

Master/Slave Operation 

Two Am29027 accelerators can be tied together in
master/slave configuration, with the slave checking the
results produced by the master. All input and output
signals of the slave, with the exception of SLAVE and
MSERR, are connected directly to the corresponding
signals of the master. The master is selected by asserting
signal SLAVE Low, the slave by asserting signal
SLAVE High.

The slave accelerator, by comparing its outputs to the 
outputs of the master accelerator, performs a compre- 
hensive check of master accelerator logic. In addition, if 
the slave accelerator is connected at the proper position 
on the Am29000 buses, it may detect open circuits and 
other faults in the electrical path between the master ac- 
celerator and the Am29000. 

Note that the master accelerator also performs a
comparison between its outputs and its own internally
generated results, and is therefore able to detect faults
in its output drivers, which it reports with its MSERR
signal.



can begin. This is accomplished by asserting the RESET 
signal, which initializes accelerator state as follows: 

■ All bits in the status register are cleared

■ The accelerator is placed in flow-through mode

■ Signal CDA is active; signals DRDY and DERR are
inactive

■ All internal circuitry controlling operation timing is
initialized

The RESET signal does not initialize the operand and
instruction registers and may corrupt existing register
contents. It is the responsibility of the user to initialize
these registers, if needed.

Applications 

Suggestions for Power and Ground 
Pin Connections 

The Am29027 operates in an environment of fast signal 
rise times and substantial switching currents. Therefore, 
care must be exercised during circuit board design and 
layout, as with any high-performance component. The 
following is a suggested layout, but since systems vary 
widely in electrical configuration, an empirical evalu- 
ation of the intended layout is recommended. 

The Vcco and GNDO pins carry output driver switching 
currents and can be electrically noisy. The Vcc and GND 
pins, which supply the logic core of the device, tend to 
produce less noise and the circuits they supply may be 
adversely affected by noise spikes on the Vcc plane. For 
this reason, it is best to provide isolation between the 
Vcc and Vcco pins as well as independent decoupling for 
each. Isolating the GND and GNDO pins is not required. 

Printed Circuit-Board Layout Suggestions 

1. Use of a multilayer PC board with separate power,
ground, and signal planes is highly recommended.

2. All Vcc and Vcco pins should be connected to the Vcc
plane. Vcco pins should be isolated from Vcc pins by
means of an isolation slot which is cut in the Vcc
plane (see Figure 14). By physically separating the
Vcc and Vcco pins, coupled noise will be reduced.

3. All GND and GNDO pins should be connected
directly to the ground plane.

4. The Vcco pins should be decoupled to ground with a
0.1-µF ceramic capacitor and a 10-µF electrolytic
capacitor, placed as closely to the Am29027 as is
practical. Vcc pins should be decoupled to ground in
a similar manner.

A suggested layout is shown in Figure 14. 



Initialization and Reset 

The accelerator is in an unknown state when power is 
first applied and must be initialized before processing

Calculate matrix product C = A × B, where:

        | a11 a12 a13 a14 |         | b1 |         | c1 |
    A = | a21 a22 a23 a24 |     B = | b2 |     C = | c2 |
        | a31 a32 a33 a34 |         | b3 |         | c3 |
        | a41 a42 a43 a44 |         | b4 |         | c4 |

    c1 = a11 × b1 + a12 × b2 + a13 × b3 + a14 × b4
    c2 = a21 × b1 + a22 × b2 + a23 × b3 + a24 × b4
    c3 = a31 × b1 + a32 × b2 + a33 × b3 + a34 × b4
    c4 = a41 × b1 + a42 × b2 + a43 × b3 + a44 × b4

Operation sequence (each column is one MAC operation; one additional MAC operation* follows the last column):

    Register R:  a11 a21 a12 a22 a13 a23 a14 a24 a31 a41 a32 a42 a33 a43 a34 a44
    Register S:  b1  b1  b2  b2  b3  b3  b4  b4  b1  b1  b2  b2  b3  b3  b4  b4

Pipeline stage 1 forms the product of each R, S pair. In pipeline stages 2 and 3, each product after
the first of its row is added to the partial sum for that row, so that partial sums (c1) and (c2)
alternate until the final values c1 and c2 appear, followed in the same way by (c3), (c4), c3, and c4.

Notes: 1. Register file location RF0 is used as the accumulator.

       2. Parentheses are used to indicate partial sums of products.

       *Additional MAC operation needed to terminate sequence.

Figure 13. Canonically Efficient Sum-of-Products Evaluation in Pipeline Mode



[Board layout drawing: decoupling capacitors C1 through C8 placed around the device, with the Vcc isolation cut, through holes, and Vcc plane connections marked.]

C1 = C3 = C5 = C7 = 0.1 µF (ceramic or monolithic capacitor)
C2 = C4 = C6 = C8 = 1 µF (electrolytic or tantalum capacitor)

Figure 20. Suggested Printed Circuit-Board Layout
(power and ground connections)

OPERATING RANGES
Commercial (C) Devices

Case Temperature (Tc) 0 to +85°C

Supply Voltage (Vcc) +4.75 V to +5.25 V

Military* (M) Devices

Case Temperature (Tc) -55 to +125°C

Supply Voltage (Vcc) +4.5 V to +5.5 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.

*Military Product 100% tested at Tc = +25°C, +125°C, and
-55°C.



ABSOLUTE MAXIMUM RATINGS 

Storage Temperature -65 to +150°C

(Ambient) Temperature Under Bias -55 to +125°C

Supply Voltage to Ground Potential Continuous -0.3 V to +7.0 V

DC Voltage Applied to Outputs for High Output State -0.3 V to +Vcc +0.3 V

DC Input Voltage -0.3 V to +Vcc +0.3 V

DC Output Current, Into Low Outputs 30 mA

DC Input Current -10 mA to +10 mA

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure.
Functionality at or above these limits is not implied.
Exposure to absolute maximum ratings for extended
periods may affect device reliability.

DC CHARACTERISTICS over COMMERCIAL operating range unless otherwise specified
(for APL Products, Group A, Subgroups 1, 2, and 3 are tested unless otherwise noted)

Parameter   Parameter
Symbol      Description                        Test Conditions (Note 1)                       Min.       Max.   Unit

VOH         Output High Voltage                Vcc = Min., VIN = VIH or VIL, IOH = -4.0 mA    2.4               V
VOL         Output Low Voltage                 Vcc = Min., VIN = VIH or VIL, IOL = 4.0 mA                0.45   V
VIH         Guaranteed Input Logical                                                          2.0               V
            High Voltage (Note 2)
VIL         Guaranteed Input Logical                                                                     0.8    V
            Low Voltage (Note 2)
VIH(F)      Guaranteed Input Logical           F Bus, Slave Operation Only                    Vcc -0.5          V
            High Voltage (Notes 2, 6)
VIL(F)      Guaranteed Input Logical           F Bus, Slave Operation Only                               0.5    V
            Low Voltage (Notes 2, 6)
ILI         Input Leakage Current              0.450 ≤ VIN ≤ Vcc -0.450, Vcc = Max.                      ±10    µA
ILO         Output Leakage Current             0.450 ≤ VOUT ≤ Vcc -0.450, Vcc = Max.                     ±10    µA
Icc Static  Static Power Supply Current        Vcc = Max., IO = 0 µA:
                                                 CMOS VIN = Vcc or GND (Note 3)                          240    mA
                                                 TTL VIN = 0.5 V or 2.4 V (Note 3)                       275    mA
                                               MIL, Tc = -55 to +125°C:
                                                 CMOS VIN = Vcc or GND (Note 3)                                 mA
                                                 TTL VIN = 0.5 V or 2.4 V (Note 3)                              mA
Iccop       Operating Power Supply Current     Vcc = Max., Outputs floating                              9.0    mA/MHz

Notes: 1. Vcc conditions shown as Min. or Max. refer to ±5% Vcc (commercial) and ±10% Vcc (military).

2. These input levels provide zero noise immunity and should only be statically tested in a noise-free environment
(not functionally tested).

3. Use CMOS Icc when the device is driven by CMOS circuits and TTL Icc when the device is driven by TTL circuits.

4. Icc (Total) = Icc (Static) + Iccop × f, where f is in MHz. This is tested on a sample basis only.

5. Tested on a sample basis only.

6. These levels guaranteed compatible with F bus output levels.
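
Note 4 can be applied directly to estimate total supply current at a given clock frequency. The short Python sketch below evaluates it using the maximum CMOS static current and operating current figures from the table; the function name is hypothetical, and the choice of 25 MHz is only an example.

```python
def total_icc_ma(static_ma, op_ma_per_mhz, freq_mhz):
    """Icc (Total) = Icc (Static) + Iccop x f, with f in MHz."""
    return static_ma + op_ma_per_mhz * freq_mhz


# 240 mA static (CMOS inputs) and 9.0 mA/MHz operating, evaluated at 25 MHz.
print(total_icc_ma(240, 9.0, 25))
```

The result is an upper-bound estimate, since both inputs are maximum figures.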



CAPACITANCE

Parameter   Parameter
Symbol      Description            Test Conditions       Min.   Max.   Unit

CIN         Input Capacitance      fc = 1 MHz (Note 5)          12     pF
COUT        Output Capacitance                                  20     pF
CI/O        I/O Pin Capacitance                                 20     pF

SWITCHING CHARACTERISTICS over COMMERCIAL operating range

                                                         Test          25 MHz        20 MHz        16 MHz
No.  Parameter Description                               Conditions    Min.   Max.   Min.   Max.   Min.   Max.   Unit

1    CLK Period                                          (Note 1)      40     DC     50     DC     60     DC     ns
2    CLK Low Time                                                      18            20            22            ns
3    CLK High Time                                                     18            20            22            ns
4    CLK Rise Time                                       (Note 2)             5             5             5      ns
5    CLK Fall Time                                       (Note 2)             5             5             5      ns
6    Operation Time, Low-Latency Mode,                                        280           300           360    ns
     F' = (P' × Q') + T'
7    Operation Time, Low-Latency Mode, MOVE P                                 120           150           180    ns
8    Operation Time, Low-Latency Mode,                                        200           250           300    ns
     (All Other Base Operation Codes)
9    Operation Time, Pipeline Mode, All Operations                                          150           180    ns
10   Transaction Request Setup Time                      (Note 3)      20            24            26            ns
11   Transaction Request Hold Time                       (Note 3)      0                                         ns
12   BINV Setup Time                                                   11            13            15            ns
13   BINV Hold Time                                                    2             2             2             ns
14   Data Setup Time                                     (Note 4)      18            22            24            ns
15   Data Hold Time                                                    2             2             2             ns
16   Instruction Setup Time                              (Note 5)      18            22            24            ns
17   Instruction Hold Time                                             2             2             2             ns
18   CDA CLK-to-Output-Valid Delay                                            20            24            26     ns
19   F31-F0 CLK-to-Output-Valid Delay                                         30            35            37     ns
20   F31-F0 Three-State CLK-to-Output-Inactive Delay     (Note 6)             22            25            27     ns
21   Data Operation-Start-to-Output-Valid Delay,                              270           285           340    ns
     F' = (P' × Q') + T'
22   Data Operation-Start-to-Output-Valid Delay,                              110           135           160    ns
     MOVE P
23   Data Operation-Start-to-Output-Valid Delay,                              190           235           280    ns
     (All Other Base Operation Codes)
24   DRDY CLK-to-Output-Valid Delay                                           18            21            23     ns
25   DERR CLK-to-Output-Valid Delay                                           18            21            23     ns
26   EXCP CLK-to-Output-Valid Delay                                           18            21            23     ns
27   MSERR CLK-to-Output-Valid Delay                                          20            25            30     ns

Notes: 1. CLK switching characteristics are made relative to 1.5 V.

2. CLK rise time/fall time measured between 0.8 V and (Vcc -1.0 V). Tested on a sample basis only.

3. Transaction request signals include R/W, DREQ, DREQT1-DREQT0, and OPT2-OPT0.

4. Data signals include R31-R0 and S31-S0.

5. Instruction signals include I31-I0.

6. Three-State Output Inactive Test Load. Three-State CLK-to-Output-Inactive Delay is measured as the time to a
±500 mV change from prior output level.

Conditions: A. All inputs/outputs are TTL-compatible for VIH, VIL, and VOL unless otherwise noted.

B. All outputs are driving 80 pF unless otherwise noted.

C. All setup, hold, and delay times are measured relative to CLK at 1.5 V unless otherwise noted.
SWITCHING CHARACTERISTICS over MILITARY operating range

                                                         Test          20 MHz        16 MHz
No.  Parameter Description                               Conditions    Min.   Max.   Min.   Max.   Unit

1    CLK Period                                          (Note 1)      50     DC     60     DC     ns
2    CLK Low Time                                                      20            22            ns
3    CLK High Time                                                     20            22            ns
4    CLK Rise Time                                       (Note 2)             5             5      ns
5    CLK Fall Time                                       (Note 2)             5             5      ns
6    Operation Time, Low-Latency Mode,                                        300           360    ns
     F' = (P' × Q') + T'
7    Operation Time, Low-Latency Mode, MOVE P                                               180    ns
8    Operation Time, Low-Latency Mode,                                        250           300    ns
     (All Other Base Operation Codes)
9    Operation Time, Pipeline Mode, All Operations                            150           180    ns
10   Transaction Request Setup Time                      (Note 3)      24            26            ns
11   Transaction Request Hold Time                       (Note 3)                                  ns
12   BINV Setup Time                                                   14            16            ns
13   BINV Hold Time                                                    2             2             ns
14   Data Setup Time                                     (Note 4)      22            24            ns
15   Data Hold Time                                                    2             2             ns
16   Instruction Setup Time                              (Note 5)      22            24            ns
17   Instruction Hold Time                                             2             2             ns
18   CDA CLK-to-Output-Valid Delay                                            24            26     ns
19   F31-F0 CLK-to-Output-Valid Delay                                         35            40     ns
20   F31-F0 Three-State CLK-to-Output-Inactive Delay     (Note 6)             26            30     ns
21   Data Operation-Start-to-Output-Valid Delay,                              285           340    ns
     F' = (P' × Q') + T'
22   Data Operation-Start-to-Output-Valid Delay,                              135           160    ns
     MOVE P
23   Data Operation-Start-to-Output-Valid Delay,                              235           280    ns
     (All Other Base Operation Codes)
24   DRDY CLK-to-Output-Valid Delay                                           21            23     ns
25   DERR CLK-to-Output-Valid Delay                                           21            23     ns
26   EXCP CLK-to-Output-Valid Delay                                           21            23     ns
27   MSERR CLK-to-Output-Valid Delay                                          25            30     ns

Notes: 1. CLK switching characteristics are made relative to 1.5 V.

2. CLK rise time/fall time measured between 0.8 V and (Vcc -1.0 V). Tested on a sample basis only.

3. Transaction request signals include R/W, DREQ, DREQT1-DREQT0, and OPT2-OPT0.

4. Data signals include R31-R0 and S31-S0.

5. Instruction signals include I31-I0.

6. Three-State Output Inactive Test Load. Three-State CLK-to-Output-Inactive Delay is measured as the time to a
±500 mV change from prior output level.

Conditions: A. All inputs/outputs are TTL-compatible for VIH, VIL, and VOL unless otherwise noted.

B. All outputs are driving 80 pF unless otherwise noted.

C. All setup, hold, and delay times are measured relative to CLK at 1.5 V unless otherwise noted.
SWITCHING WAVEFORMS

[Waveform diagram: CLK, Transaction Request, BINV, Data/Instruction, CDA, and EXCP. Setup and hold parameters 10 through 15 and output delays 18 and 26 are marked, with signals referenced to the 1.5 V level.]

Input Signal Timing; CDA, EXCP Timing
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SWITCHING WAVEFORMS (continued) 



[Waveform diagram: CLK, F31-F0, DRDY, DERR, and EXCP relative to the start of operation, with timing parameters 6, 7, 8, 20, 24, 25, and 26; output levels referenced to 1.5 V, VOH - 0.5 V, and VOL + 0.5 V]

Operation Timing for Flow-Through Mode, DRDY, DERR Not Advanced
(Mode Register Bit AD = 0)



Notes: 1. Transaction Request Write Operand R; Write Operand S; Write Operands R, S; or Write Instruction with signal
DREQT0 asserted.

2. Transaction Request Read Result MSBs, Read Result LSBs, Read Flags, Read Status, or Save State. If request
Read Result LSBs is issued, the Am29027 produces two data outputs in two consecutive cycles, with
DRDY or DERR active for both cycles.

3. Signal EXCP is asserted in the presence of an unmasked exception.

SWITCHING WAVEFORMS (continued)



[Waveform diagram: CLK, Transaction Request, F31-F0, DRDY, DERR, and EXCP relative to the start of operation, with timing parameters 21, 22, and 23; output levels referenced to VOH - 0.5 V and VOL + 0.5 V]

Operation Timing for Flow-Through Mode, DRDY, DERR Advanced
(Mode Register Bit AD = 1)



Notes: 1. Transaction Request Write Operand R; Write Operand S; Write Operands R, S; or Write Instruction with signal
DREQT0 asserted.

2. Transaction Request Read Result MSBs, Read Result LSBs, Read Flags, Read Status, or Save State. If request
Read Result LSBs is issued, the Am29027 produces two data outputs in consecutive cycles, with DRDY
or DERR active for both cycles.

3. Signal EXCP is asserted in the presence of an unmasked exception. 






SWITCHING WAVEFORMS (continued) 



[Waveform diagram: CLK, Transaction Request, F31-F0, DRDY, DERR, and EXCP relative to the start of operation, with timing parameters 20, 24, 25, and 26; output levels referenced to 1.5 V, VOH - 0.5 V, and VOL + 0.5 V]

Operation Timing for Pipeline Mode



Notes: 1. Transaction Request Write Operand R; Write Operand S; Write Operands R, S; or Write Instruction with signal
DREQT0 asserted.

2. Transaction Request Read Result MSBs, Read Result LSBs, Read Flags, Read Status, or Save State. If request
Read Result LSBs is issued, the Am29027 produces two data outputs in consecutive cycles, with DRDY
or DERR active for both cycles.

3. Signal EXCP is asserted in the presence of an unmasked exception.



[Waveform diagram: CLK and MSERR; a master/slave discrepancy during a cycle is reported on MSERR with timing parameter 27, referenced to the 1.5 V level]

Master/Slave Timing

SWITCHING TEST CIRCUIT



[Circuit diagram, Three-State Output Inactive Test: VOUT, CL = 5 pF, R1 = 300 ohms, R2 = 1K, VCC]

[Circuit diagram, switching test load: Am29027 pin under test, CL, VREF = 1.5 V, IOL = 4.0 mA, IOH = 4.0 mA]

CL is guaranteed to 80 pF.

TEST PHILOSOPHY AND METHODS

The following nine points describe AMD's philosophy for 
high-volume, high-speed automatic testing. 

1. Ensure that the part is adequately decoupled at the
test head. Large changes in Vcc current as the device
switches may cause erroneous function failures
due to Vcc changes.

2. Do not leave inputs floating during any tests, as they 
may start to oscillate at high frequency. 

3. Do not attempt to perform threshold tests at high 
speed. Following an output transition, ground cur- 
rent may change by as much as 400 mA in 5-8 ns. 
Inductance in the ground cable may allow the 
ground pin at the device to rise by hundreds of mil- 
livolts momentarily. 

4. Use extreme care in defining input levels for
AC tests. Many inputs may be changed at once, so
there will be significant noise at the device pins and
they may not actually reach VIL or VIH until the noise
has settled. AMD recommends using VIL ≤ 0 V and
VIH ≥ 3.0 V for AC tests.

5. To simplify failure analysis, programs should be de- 
signed to perform DC, Function, and AC tests as 
three distinct groups of tests. 

6. Capacitive Loading for AC Testing. 

Automatic testers and their associated hardware 
have stray capacitance that varies from one type of 
tester to another, but is generally around 50 pF. 
This, of course, makes it impossible to make direct 
measurements of parameters that call for smaller 
capacitive load than the associated stray capaci- 
tance. Typical examples of this are the so-called 
float delays, which measure the propagation delays 
into the high-impedance state and are usually 
specified at a load capacitance of 5.0 pF. In these 
cases, the test is performed at the higher load ca- 
pacitance (typically 50 pF), and engineering corre- 
lations based on data taken with a bench setup are 
used to predict the result at the lower capacitance. 

Similarly, a product may be specified at more than
one capacitive load. Since the typical automatic
tester is not capable of switching loads in mid-test, it
is impossible to make measurements at both ca- 
pacitances even though they may both be greater 
than the stray capacitance. In these cases, a mea- 
surement is made at one of the two capacitances. 
The result at the other capacitance is predicted from 
engineering correlations based on data taken with a 
bench setup and the knowledge that certain DC 
measurements (IOH, IOL, for example) have already
been taken and are within spec. In some cases, 
special DC tests are performed in order to facilitate 
this correlation. 

7. Threshold Testing 

The noise associated with automatic testing (due to 
the long, inductive cables) and the high gain of the 
tested device when in the vicinity of the actual de- 
vice threshold, frequently give rise to oscillations 
when testing high-speed circuits. These oscillations 
are not indicative of a reject device, but instead of an 
overtaxed test system. To minimize this problem, 
thresholds are tested at least once for each input 
pin. Thereafter, hard high and low levels are used 
for other tests. Generally this means that function 
and AC testing are performed at hard input levels 
rather than at VIL Max. and VIH Min.

8. AC Testing 

Occasionally, parameters are specified that cannot 
be measured directly on automatic testers because 
of tester limitations. Data input hold times often fall 
into this category. In these cases, the parameter 
in question is guaranteed by correlating these tests 
with other AC tests that have been performed. 
These correlations are arrived at by the cognizant 
engineer by using precise bench measurements in 
conjunction with the knowledge that certain DC 
parameters have already been measured and are 
within spec. 

In some cases, certain AC tests are redundant, 
since they can be shown to be predicted by some 
other tests that have already been performed. In 
these cases, the redundant tests are not performed. 

Am29027 Thermal Characteristics
Pin-Grid-Array Package 



θJA = θJC + θCA

Thermal Resistance - °C/Watt

                                                    Airflow, ft./min. (m/sec)
Parameter                                           0     150     300     480     700     900
                                                   (0)  (0.76)  (1.53)  (2.45)  (3.58)  (4.61)
θJC Junction-to-Case                                4       4       4       4       4       4
θCA Case-to-Ambient (no Heatsink)                  16      14      12      11       9       8
θCA Case-to-Ambient (with omnidirectional 4-Fin
    Heatsink, Thermalloy 0417261)                  10       6       3       2       2       2
θCA Case-to-Ambient (with unidirectional Pin Fin
    Heatsink, Wakefield 840-20)                    10       6       3       2       2       2
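
The relation θJA = θJC + θCA can be used to estimate junction temperature with the usual formula Tj = Ta + P × θJA. The following is a minimal sketch, not a design procedure: the function names, the 1.5 W power figure, and the 50 °C ambient are assumptions chosen for illustration, while the thermal resistances are the 900 ft./min. no-heatsink entries of the pin-grid-array table.

```python
def theta_ja(theta_jc, theta_ca):
    """Junction-to-ambient thermal resistance in degC/W (thetaJC + thetaCA)."""
    return theta_jc + theta_ca


def junction_temperature(ambient_c, power_w, theta_jc, theta_ca):
    """Estimate junction temperature: Tj = Ta + P * thetaJA."""
    return ambient_c + power_w * theta_ja(theta_jc, theta_ca)


# Table values at 900 ft./min., no heatsink: thetaJC = 4, thetaCA = 8.
# The 1.5 W dissipation and 50 degC ambient are hypothetical inputs.
print(junction_temperature(50.0, 1.5, 4, 8))
```

Actual power dissipation depends on operating frequency and supply conditions, so the result should be read as an order-of-magnitude check only.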



Am29027 Thermal Characteristics
Ceramic Quad-Flat-Pack Package

θJA = θJC + θCA

Thermal Resistance - °C/Watt

                                         Airflow, ft./min. (m/sec)
Parameter                                0     150     300     480     700     900
                                        (0)  (0.76)  (1.53)  (2.45)  (3.58)  (4.61)
θJC Junction-to-Case
θCA Case-to-Ambient (no Heatsink)















Note: This is for reference only. 

APPENDIX A— DATA FORMATS

The following data formats are supported: 32-bit integer, 64-bit integer, IEEE single-precision, IEEE double-precision, 
DEC F, DEC D, DEC G, IBM single-precision, and IBM double-precision. 

The primary and alternate floating-point formats are selected by mode register fields PFF and AFF. The user may 
select between floating-point operations and integer operations by means of instruction bit iNs. 

The nine supported formats are described below: 

Integer Formats 

32-Bit Integer 

The 32-bit integer word is arranged as follows: 



Bit:      31     30    29    28   ...   3    2    1    0
Weight:  -2^31  2^30  2^29  2^28  ...  2^3  2^2  2^1  2^0

The 32-bit word is interpreted as a two's-complement integer. For integer multiplications, the user has the option of
interpreting integers as unsigned. An unsigned single-precision integer has a format similar to that of the
two's-complement integer, but with an MSB weight of 2^31.

64-Bit Integer 

The 64-bit integer word is arranged as follows: 

Bit:      63     62    61    60   ...   3    2    1    0
Weight:  -2^63  2^62  2^61  2^60  ...  2^3  2^2  2^1  2^0

The 64-bit word is interpreted as a two's-complement integer. For integer multiplications, the user has the option of
interpreting integers as unsigned. An unsigned double-precision integer has a format similar to that of the
two's-complement integer, but with an MSB weight of 2^63.
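
The difference between the two interpretations is only the weight of the MSB. The sketch below illustrates this for either word width; the function names are hypothetical and do not correspond to any Am29027 operation.

```python
def as_unsigned(word, width):
    """Interpret the low `width` bits with an MSB weight of +2^(width-1)."""
    return word & ((1 << width) - 1)


def as_twos_complement(word, width):
    """Interpret the low `width` bits with an MSB weight of -2^(width-1)."""
    word &= (1 << width) - 1
    sign_bit = 1 << (width - 1)
    return word - (1 << width) if word & sign_bit else word


print(as_twos_complement(0xFFFFFFFF, 32), as_unsigned(0xFFFFFFFF, 32))
```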

IEEE Formats 

IEEE Single Precision 

The IEEE single-precision word is 32 bits wide and is arranged in the format shown below: 



Bit 31         sign (s)
Bits 30-23     biased exponent (e), bit weights 2^7 down to 2^0
Bits 22-0      fraction (f), bit weights 2^-1 down to 2^-23



The floating-point word is divided into three fields: a single-bit sign, an 8-bit biased exponent, and a 23-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative numbers. Zero may have either sign.

The biased exponent is an 8-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 127. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 127, where "a" is the true exponent.

The fraction is a 23-bit unsigned fractional field containing the 23 least significant bits of the floating-point number's
24-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-23.

An IEEE floating-point number is evaluated or interpreted as follows: 

If e = 255 and f ≠ 0     value = NaN                       Not a Number
If e = 255 and f = 0     value = (-1)^s ∞                  Infinity
If 0 < e < 255           value = (-1)^s 2^(e-127) (1.f)    Normalized number
If e = 0 and f ≠ 0       value = (-1)^s 2^(-126) (0.f)     Denormalized number
If e = 0 and f = 0       value = (-1)^s 0                  Zero

Infinity: Infinity can have either a positive or negative sign. The interpretation of infinities is determined by mode 
register bit AP. 

NaN: A NaN is interpreted as a signal or symbol. NaNs are used to indicate invalid operations and as a means of 
passing process status through a series of calculations. They arise in two ways: either generated by the Am29027 to 
indicate an invalid operation, or provided by the user as an input. A signaling NaN has the MSB of its fraction set to
0 and at least one of the remaining fraction bits set to 1. A quiet NaN has the MSB of its fraction set to 1.

The IEEE format is fully described in ANSI/IEEE Standard 754-1985. 
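
The evaluation rules for the single-precision format can be sketched in a few lines. This is an illustrative decoder only, not a model of the Am29027 data path; the function name is an assumption, and the NaN case is returned as a generic NaN without distinguishing signaling from quiet.

```python
import math


def decode_ieee_single(word):
    """Evaluate a 32-bit IEEE single-precision word per the rules above."""
    s = (word >> 31) & 1          # sign
    e = (word >> 23) & 0xFF       # biased exponent
    f = word & 0x7FFFFF           # 23-bit fraction
    sign = -1.0 if s else 1.0
    if e == 255:
        return math.nan if f else sign * math.inf
    if e == 0:
        # Zero (f == 0) or denormalized number: (0.f) * 2^-126
        return sign * (f / 2**23) * 2.0**-126
    # Normalized number: (1.f) * 2^(e-127)
    return sign * (1 + f / 2**23) * 2.0**(e - 127)


print(decode_ieee_single(0x3F800000), decode_ieee_single(0xC0200000))
```

The double-precision rules follow the same pattern with an 11-bit exponent, a 52-bit fraction, and a bias of 1023.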

IEEE Double Precision 

The IEEE double-precision word is 64 bits wide and is arranged in the format shown below: 



Bit 63         sign (s)
Bits 62-52     biased exponent (e), bit weights 2^10 down to 2^0
Bits 51-0      fraction (f), bit weights 2^-1 down to 2^-52



The floating-point word is divided into three fields: a single-bit sign, an 11-bit biased exponent, and a 52-bit fraction.

The sign bit is 0 for positive numbers and 1 for negative numbers; zero may have either sign.

The biased exponent is an 11-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 1023. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 1023, where "a" is the true exponent.

The fraction is a 52-bit unsigned fractional field containing the 52 least significant bits of the floating-point number's
53-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-52.

An IEEE floating-point number is evaluated or interpreted as follows: 



If e = 2047 and f ≠ 0    value = Reserved operand           Not a Number
If e = 2047 and f = 0    value = (-1)^s ∞                   Infinity
If 0 < e < 2047          value = (-1)^s 2^(e-1023) (1.f)    Normalized number
If e = 0 and f ≠ 0       value = (-1)^s 2^(-1022) (0.f)     Denormalized number
If e = 0 and f = 0       value = (-1)^s 0                   Zero



Infinity: Infinity can have either a positive or negative sign. The interpretation of infinities is determined by mode regis- 
ter bit AP. 

NaN: A NaN is interpreted as a signal or symbol. NaNs are used to indicate invalid operations and as a means of 
passing process status through a series of calculations. They arise in two ways: either generated by the Am29027 to 
indicate an invalid operation, or provided by the user as an input. A signaling NaN has the MSB of its fraction set to
0 and at least one of the remaining fraction bits set to 1. A quiet NaN has the MSB of its fraction set to 1.

The IEEE format is fully described in ANSI/IEEE Standard 754-1985. 

DEC Formats

DEC F

The DEC F word is 32 bits wide and is arranged in the format shown below: 



Bit 31         sign (s)
Bits 30-23     biased exponent (e), bit weights 2^7 down to 2^0
Bits 22-0      fraction (f), bit weights 2^-2 down to 2^-24



The floating-point word is divided into three fields: a single-bit sign, an 8-bit biased exponent, and a 23-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative numbers; zero has a positive sign.

The biased exponent is an 8-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 128. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 128, where "a" is the true exponent.

The fraction is a 23-bit unsigned fractional field containing the 23 least significant bits of the floating-point number's
24-bit mantissa. The weight of the fraction's most significant bit is 2^-2. The weight of the least significant bit is 2^-24.

A DEC F floating-point number is evaluated or interpreted as follows: 

If e ≠ 0                 value = (-1)^s 2^(e-128) (0.1f)
If s = 0 and e = 0       value = 0
If s = 1 and e = 0       value = DEC-Reserved Operand

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted as a signal or symbol. DEC-Reserved Operands 
are used to indicate invalid operations and operations whose results have overflowed the destination format. They 
may also be used to pass symbolic information from one calculation to another. 

The DEC formats are fully described in the VAX Architecture Manual.
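
The DEC F rules differ from IEEE in the hidden-bit position (0.1f rather than 1.f) and in the absence of denormalized numbers. A minimal sketch follows, assuming the Am29027 field arrangement shown above (sign in bit 31); the function name is hypothetical, and `None` is used here merely as a stand-in for the DEC-Reserved Operand.

```python
def decode_dec_f(word):
    """Evaluate a 32-bit DEC F word in the Am29027 field arrangement."""
    s = (word >> 31) & 1          # sign
    e = (word >> 23) & 0xFF       # biased exponent, bias 128
    f = word & 0x7FFFFF           # 23-bit fraction, MSB weight 2^-2
    if e == 0:
        # s = 0 gives zero; s = 1 is the DEC-Reserved Operand (None here).
        return 0.0 if s == 0 else None
    mantissa = 0.5 + f / 2**24    # 0.1f: hidden bit has weight 2^-1
    return (-1.0) ** s * mantissa * 2.0 ** (e - 128)


print(decode_dec_f(0x40800000), decode_dec_f(0xC0C00000))
```

DEC D and DEC G follow the same rules with wider fields, and with a bias of 1024 for DEC G.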

DEC D

The DEC D word is 64 bits wide and is arranged in the format shown below: 



Bit 63         sign (s)
Bits 62-55     biased exponent (e), bit weights 2^7 down to 2^0
Bits 54-0      fraction (f), bit weights 2^-2 down to 2^-56



The floating-point word is divided into three fields: a single-bit sign, an 8-bit biased exponent, and a 55-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative numbers; zero has a positive sign.

The biased exponent is an 8-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 128. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 128, where "a" is the true exponent.

The fraction is a 55-bit unsigned fractional field containing the 55 least significant bits of the floating-point number's
56-bit mantissa. The weight of the fraction's most significant bit is 2^-2. The weight of the least significant bit is 2^-56.

A DEC D floating-point number is evaluated or interpreted as follows: 

If e ≠ 0                 value = (-1)^s 2^(e-128) (0.1f)
If s = 0 and e = 0       value = 0
If s = 1 and e = 0       value = DEC-Reserved Operand

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted as a signal or symbol. DEC-Reserved Operands 
are used to indicate invalid operations and operations whose results have overflowed the destination format. They 
may also be used to pass symbolic information from one calculation to another. 

The DEC formats are fully described in the VAX Architecture Manual. 



DEC G

The DEC G word is 64 bits wide and is arranged in the format shown below: 



Bit 63         sign (s)
Bits 62-52     biased exponent (e), bit weights 2^10 down to 2^0
Bits 51-0      fraction (f), bit weights 2^-2 down to 2^-53



The floating-point word is divided into three fields: a single-bit sign, an 11-bit biased exponent, and a 52-bit fraction.

The sign bit is 0 for positive numbers and 1 for negative numbers; zero has a positive sign.

The biased exponent is an 11-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 1024. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 1024, where "a" is the true exponent.

The fraction is a 52-bit unsigned fractional field containing the 52 least significant bits of the floating-point number's
53-bit mantissa. The weight of the fraction's most significant bit is 2^-2. The weight of the least significant bit is 2^-53.

A DEC G floating-point number is evaluated or interpreted as follows: 

If e ≠ 0                 value = (-1)^s 2^(e-1024) (0.1f)
If s = 0 and e = 0       value = 0
If s = 1 and e = 0       value = DEC-Reserved Operand

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted as a signal or symbol. DEC-Reserved Operands 
are used to indicate invalid operations and operations whose results have overflowed the destination format. They 
may also be used to pass symbolic information from one calculation to another. 

The DEC formats are fully described in the VAX Architecture Manual.

IBM Formats 

IBM Single Precision 

The IBM single-precision word is 32 bits wide and is arranged in the format shown below: 



Bit 31         sign (s)
Bits 30-24     biased exponent (e), bit weights 2^6 down to 2^0
Bits 23-0      fraction (f), bit weights 2^-1 down to 2^-24



The floating-point word is divided into three fields: a single-bit sign, a 7-bit biased exponent, and a 24-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative numbers; a true zero has a positive sign.

The biased exponent is a 7-bit unsigned integer representing a multiplicative factor of some power of 16. The bias
value is 64. If, for example, the multiplicative value for a floating-point number is to be 16^a, the value of the biased
exponent is a + 64, where "a" is the true exponent.

The fraction is a 24-bit unsigned fractional field containing the 24 least significant bits of the floating-point number's
25-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-24.

An IBM floating-point number is evaluated or interpreted as follows: 

Value = (-1)^s 16^(e-64) (0.f)

Zero: There are two classes of zero. If the sign, biased exponent, and fraction are all zero, the operand is known as a 
"True Zero." If the fraction is zero, but the sign and biased exponent are not both zero, the operand is known as a 
"Floating-point Zero." 
The IBM format is fully described in the IBM System/370 Principles of Operation Manual. 
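
The single evaluation rule, together with the two classes of zero, can be sketched as follows. The function names and the returned class labels are illustrative assumptions; only the arithmetic comes from the format description above.

```python
def decode_ibm_single(word):
    """Evaluate a 32-bit IBM single-precision word: (-1)^s 16^(e-64) (0.f)."""
    s = (word >> 31) & 1          # sign
    e = (word >> 24) & 0x7F       # biased exponent, bias 64, base 16
    f = word & 0xFFFFFF           # 24-bit fraction, MSB weight 2^-1
    return (-1.0) ** s * (f / 2**24) * 16.0 ** (e - 64)


def ibm_zero_class(word):
    """Classify zeros: 'true zero', 'floating-point zero', or None if nonzero."""
    if word & 0xFFFFFF:
        return None
    return "true zero" if word == 0 else "floating-point zero"


print(decode_ibm_single(0x41100000), decode_ibm_single(0xC276A000))
```

Because the exponent base is 16, a normalized fraction may carry up to three leading zero bits, which is why 1.0 is encoded with a fraction of 1/16 and an exponent of 16^1.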

IBM Double Precision

The IBM double-precision word is 64 bits wide and is arranged in the format shown below:



Bit 63         sign (s)
Bits 62-56     biased exponent (e), bit weights 2^6 down to 2^0
Bits 55-0      fraction (f), bit weights 2^-1 down to 2^-56



The floating-point word is divided into three fields: a single-bit sign, a 7-bit biased exponent, and a 56-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative numbers; a true zero has a positive sign.

The biased exponent is a 7-bit unsigned integer representing a multiplicative factor of some power of 16. The bias
value is 64. If, for example, the multiplicative value for a floating-point number is to be 16^a, the value of the biased
exponent is a + 64, where "a" is the true exponent.

The fraction is a 56-bit unsigned fractional field containing the 56 least significant bits of the floating-point number's
57-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-56.

An IBM floating-point number is evaluated or interpreted as follows:

Value = (-1)^s 16^(e-64) (0.f)

Zero: There are two classes of zero. If the sign, biased exponent, and fraction are all zero, the operand is known as a 
"True Zero." If the fraction is zero, but the sign and biased exponent are not both zero, the operand is known as a 
"Floating-point Zero." 

The IBM format is fully described in the IBM System/370 Principles of Operation Manual. 

APPENDIX B— ROUNDING MODES

The round mode is selected by mode register field RMS as follows: 



RMS     Round Mode
000     Round to Nearest (IEEE)
001     Round to Minus Infinity (IEEE)
010     Round to Plus Infinity (IEEE)
011     Round to Zero (IEEE)
100     Round to Nearest (DEC)
101     Round Away from Zero
11X     Illegal Value



Round to Nearest (IEEE) 

The infinitely precise result of an operation is rounded to the closest representable value in the destination format. If 
the infinitely precise result is exactly halfway between two representations, it is rounded to the representation having 
a least significant bit of 0. 

Round to Minus Infinity (IEEE) 

The infinitely precise result of an operation is rounded to the closest representable value in the destination format that
is less than or equal to the infinitely precise result. 

Round to Plus Infinity (IEEE) 

The infinitely precise result of an operation is rounded to the closest representable value in the destination format that 
is greater than or equal to the infinitely precise result. 

Round to Zero (IEEE) 

The infinitely precise result of an operation is rounded to the closest representable value in the destination format 
whose magnitude is less than or equal to the infinitely precise result. 

Round to Nearest (DEC) 

The infinitely precise result of an operation is rounded to the closest representable value in the destination format. If 
the infinitely precise result is exactly halfway between two representations, it is rounded to the representation having 
the greater magnitude. 

Round Away from Zero 

The infinitely precise result of an operation is rounded to the closest representable value in the destination format 
whose magnitude is greater than or equal to the infinitely precise result. 
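
The six definitions above can be compared side by side with a small model. In this sketch the "destination format" is simply the integers, so a least significant bit of 0 corresponds to an even integer; that simplification, the function name, and the use of exact fractions for the "infinitely precise result" are assumptions made for illustration. The `mode` argument uses the RMS encodings from the table.

```python
from fractions import Fraction
import math


def round_result(x, mode):
    """Round exact value x to an integer under RMS encoding `mode` (0-5)."""
    if mode not in range(6):
        raise ValueError("illegal RMS value")
    x = Fraction(x)
    lo, hi = math.floor(x), math.ceil(x)
    if lo == hi:                      # already representable
        return lo
    if mode == 0b001:                 # Round to Minus Infinity
        return lo
    if mode == 0b010:                 # Round to Plus Infinity
        return hi
    if mode == 0b011:                 # Round to Zero
        return lo if x > 0 else hi
    if mode == 0b101:                 # Round Away from Zero
        return hi if x > 0 else lo
    # Round to Nearest, IEEE (000) or DEC (100)
    if x - lo != hi - x:
        return lo if x - lo < hi - x else hi
    if mode == 0b000:                 # tie: choose the value with LSB of 0
        return lo if lo % 2 == 0 else hi
    return hi if x > 0 else lo        # tie: choose the greater magnitude


for mode in range(6):
    print(format(mode, "03b"), round_result(Fraction(5, 2), mode),
          round_result(Fraction(-5, 2), mode))
```

The two round-to-nearest modes differ only on exact ties, which is why the halfway inputs 5/2 and -5/2 are used in the loop.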

A graphical representation of these round modes is shown in Figures B1 and B2. 

The IEEE standard specifies that all four "IEEE" modes be available so that the user may select the mode most 
appropriate for the algorithm being executed. The DEC standard specifies that two rounding modes be available — 
Round-to-Nearest (DEC) and Round-to-Zero. The IBM standard specifies that all operations be performed using the 
Round-to-Zero mode. 

It should be noted, however, that the Am29027 permits any of the supported rounding modes to be selected,
regardless of the format of the operation. It is permissible to use one of the IEEE rounding modes with an IBM operation, or
DEC rounding with an IEEE operation, or any other possible combination. For those integer operations where round- 
ing is performed, any rounding mode may be chosen. This flexibility allows the user to select the mode most appropri- 
ate for the arithmetic environment in which the processor is operating. 

Figure B1. Graphical Interpretation of Round-to-Nearest (Unbiased), Round-to-Minus-Infinity,
and Round-to-Plus-Infinity Rounding Modes

Figure B2. Graphical Interpretation of Round-to-Zero, Round-to-Nearest (DEC),
and Round-Away-from-Zero Rounding Modes

APPENDIX C— ADDITIONAL OPERATION DETAILS

There are several cases in which the implementation of the IEEE, DEC, and IBM floating-point standards in the
Am29027 differs from the formal definitions of those standards. This appendix describes these differences.

Differences Between IEEE Floating-Point Arithmetic and Am29027 IEEE Operation

Section 7.3 of the IEEE-754 standard specifies that "Trapped overflow on conversion from a binary floating-point
format shall deliver to the trap handler a result in that or a wider format, possibly with the exponent bias adjusted, but
rounded to the destination's precision."

According to the IEEE standard, then, if a double-to-single IEEE operation overflows while traps are enabled, the 
result is a double-precision operand, rounded to single-precision width (23-bit fraction), together with a correctly ad- 
justed (double-precision) exponent and the appropriate flags for a trapped overflow. 

In the case of an overflow in any IEEE operation, the Am29027 returns a result in the destination format specified by 
the user, rounded to that destination format. 

In the case of the double-to-single overflow described above, the result from the Am29027 is a single-precision oper- 
and, together with a correctly adjusted (single-precision) exponent and the appropriate flags for a trapped overflow. 

A simple example serves to illustrate the discrepancy by describing the conversion of the double-precision IEEE num- 
ber 52B123456789ABCD to single-precision, with traps enabled, and the round-to-nearest rounding mode selected. 
This number is too large to be represented in single-precision format. 

According to the IEEE standard, the result of this operation is the double-precision number 52B1234560000000,
comprising the double-precision exponent of the input and a fraction truncated to 23 bits, together with flags V and X.

When the operation is performed in the Am29027, however, using the F' = P' operation with appropriate precision
controls, the result is the single-precision number 75891A2B, comprising the single-precision (overflowed) exponent
reduced by 192 (decimal) and a single-precision fraction, together with flags V and X.
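
The two results in this example can be reproduced with integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, flag generation is omitted, and it assumes that the adjusted single-precision exponent fits in 8 bits, as it does for the operand used in the text.

```python
def round_fraction_to_23_bits(double_word):
    """Split a 64-bit IEEE double and round its fraction to 23 bits
    (round to nearest, ties to the value with LSB of 0)."""
    sign = double_word >> 63
    exponent = (double_word >> 52) & 0x7FF
    fraction = double_word & ((1 << 52) - 1)
    kept = fraction >> 29                     # upper 23 fraction bits
    dropped = fraction & ((1 << 29) - 1)      # lower 29 fraction bits
    half = 1 << 28
    if dropped > half or (dropped == half and kept & 1):
        kept += 1
    if kept >> 23:                            # fraction carried out
        kept = 0
        exponent += 1
    return sign, exponent, kept


def ieee_trapped_result(double_word):
    """Result per IEEE-754: double format, fraction at single precision."""
    sign, exponent, kept = round_fraction_to_23_bits(double_word)
    return (sign << 63) | (exponent << 52) | (kept << 29)


def am29027_trapped_result(double_word):
    """Result as described for the Am29027: single format, exponent
    rebiased from 1023 to 127 and then reduced by 192."""
    sign, exponent, kept = round_fraction_to_23_bits(double_word)
    single_exponent = exponent - 1023 + 127 - 192
    return (sign << 31) | (single_exponent << 23) | kept


x = 0x52B123456789ABCD
print(f"{ieee_trapped_result(x):016X} {am29027_trapped_result(x):08X}")
```

For this operand the biased double exponent 52B (hex) corresponds to a true exponent of 300; rebiasing gives 427, and subtracting 192 gives 235, which fits the 8-bit exponent field.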

It should be noted that trapped operation is an optional part of the IEEE standard. Full adherence to the IEEE specifi- 
cation of trapped operation is therefore not necessary to ensure compliance with IEEE-754. 

Differences Between DEC Floating-Point Arithmetic and Am29027 DEC Operation

The DEC F, DEC D, and DEC G standards, as implemented in the Am29027, differ from the implementations in a VAX 
only in the way in which the subfields of the floating-point word are arranged. The differences are listed in Table C1.



Table C1. Differences in Am29027 and DEC Floating-Point Formats

           Am29027 Arrangement         VAX Arrangement
DEC F      sign:     bit 31            sign:     bit 15
           exponent: bits 30-23        exponent: bits 14-7
           fraction: bits 22-0         fraction: bits 6-0, bits 31-16

DEC D      sign:     bit 63            sign:     bit 15
           exponent: bits 62-55        exponent: bits 14-7
           fraction: bits 54-0         fraction: bits 6-0, bits 31-16,
                                                 bits 47-32, bits 63-48

DEC G      sign:     bit 63            sign:     bit 15
           exponent: bits 62-52        exponent: bits 14-4
           fraction: bits 51-0         fraction: bits 3-0, bits 31-16,
                                                 bits 47-32, bits 63-48
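
Read row by row, the table amounts to reversing the order of the 16-bit words. The following sketch shows one way the rearrangement might be done in software before operands are passed to the Am29027; the function names are hypothetical, and the example assumes the VAX value is held as a single 32-bit or 64-bit integer with the bit numbering used in the table.

```python
def vax_f_to_am29027(word):
    """Swap the two 16-bit halves of a 32-bit VAX DEC F value."""
    return ((word & 0xFFFF) << 16) | ((word >> 16) & 0xFFFF)


def vax_d_or_g_to_am29027(word):
    """Reverse the four 16-bit words of a 64-bit VAX DEC D or DEC G value."""
    w0 = word & 0xFFFF            # VAX bits 15-0: sign, exponent, top fraction
    w1 = (word >> 16) & 0xFFFF    # VAX bits 31-16
    w2 = (word >> 32) & 0xFFFF    # VAX bits 47-32
    w3 = (word >> 48) & 0xFFFF    # VAX bits 63-48: least significant fraction
    return (w0 << 48) | (w1 << 32) | (w2 << 16) | w3


# In this arrangement 1.0 in DEC F is 0x00004080 on the VAX side.
print(f"{vax_f_to_am29027(0x00004080):08X}")
```

Both functions are their own inverses, so the same routines can be used to convert results back to the VAX arrangement.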

Differences Between IBM 370 Floating-Point Arithmetic and Am29027 IBM Operation 

The Am29027's deviations from the IBM standard may be summarized as follows, assuming that the user has se- 
lected the round-to-nearest rounding mode: 

1. The Am29027 provides more guard bits in its internal format than specified by the IBM standard. With certain
combinations of input operands, the Am29027 produces more accurate results than a standard IBM processor for
instructions based on addition operations and comparisons.

2. The discrepancies are much larger for single-precision operations than double-precision operations, because the
difference in the number of guard bits is much greater (33 more for single, one more for double).

3. There is no universal rule for determining whether a given set of input operands will result in a discrepancy.
Provided the conditions in (1) above are met, the user must examine each operation on a case-by-case basis, taking
into account the input operands and the internal formats discussed in this section.

4. The Am29027 does not produce unnormalized results from additions. The results of all addition operations are
renormalized. Am29027 internal formats are compared with IBM internal formats in Figure C1.

Figure C1. Differences in Internal Mantissa Formats of an IBM CPU and the Am29027
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APPENDIX D— TRANSACTION REQUEST/OPERATION TIMING 
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Note: Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively. 

Figure D1. Timing for the Write Operand R, Write Operand S, Write Operands R, 
S, and Write Instruction Transaction Requests

Figure D2. Timing for the Write Mode, Write Status, and Write Register File Precisions
Transaction Requests

Figure D3. Timing for the Advance Temp. Registers Transaction Request
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Figure D4. Timing for the Read Result LSBs Transaction Request, No Unmasked Exceptions 
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Figure D5. Timing for Read Result LSBs Transaction Request, 
Unmasked Exception Present 

Figure D6. Timing for Read Result MSBs, Read Flags, and Read Status Transaction Requests

b. Second Save State Request Issued Two or More Cycles
after First Request



Figure D7. Timing for the Save State Transaction Request, 64-Bit Resources (Registers R, R-Temp, S,
S-Temp; Register File Locations RF7-RF0; Mode Register)

Figure D8. Timing for the Save State Transaction Request, 32-Bit Resources (Instruction Register,
Register I-Temp, Status Register, Precision Register)
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Figure D9. Typical Timing for Single-Precision Operation in Flow-Through Mode — Perform the Operation 
A PLUS B, Read the Result; Mode Register Field PLTC= 6 

Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D10. Typical Timing for Double-Precision Operation in Flow-Through Mode; Perform the
Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6
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Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D11. Typical Timing for Single-Precision Operation in Flow-Through Mode, with Unmasked
Exception Present; Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6
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Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D12. Typical Timing for Double-Precision Operation in Flow-Through Mode, with Unmasked
Exception Present; Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6



[Timing diagram 09114-029C. Signals shown: CLK, Transaction Request, A31-A0, D31-D0, DREQT0, CDA, DRDY, DERR. The operation is in progress for 6 cycles.]

Notes: WRS = Write Operands R, S      WI = Write Instruction
       RM = Read MSBs                 A, B = Operands A, B
       INST = Addition Instruction    RES = Result



Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively. 



Figure D13. Typical Timing for Single-Precision Operation in Flow-Through Mode, with DRDY
Advanced; Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6
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Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D14. Typical Timing for Double-Precision Operation in Flow-Through Mode, with DRDY
Advanced; Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6
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Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D15. Typical Timing for Single-Precision Operation in Flow-Through Mode, with DRDY Advanced
and Unmasked Exception Present; Perform the Operation A PLUS B, Read the Result;
Mode Register Field PLTC = 6
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Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D16. Typical Timing for Double-Precision Operation in Flow-Through Mode, with DRDY Advanced
and Unmasked Exception Present; Perform the Operation A PLUS B, Read the Result;
Mode Register Field PLTC = 6
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Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D17. Typical Timing for Overlapped Single-Precision Operations in Flow-Through Mode; Perform
the Compound Operation (A PLUS B) x C by Performing Operations: (1) RF0 <- A PLUS B, (2) RF0 x C;
Mode Register Field PLTC = 6
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Figure D18. Typical Timing for Overlapped Double-Precision Operations in Flow-Through Mode;
Perform the Compound Operation (A PLUS B) x C by Performing Operations:
(1) RF0 <- A PLUS B, (2) RF0 x C; Mode Register Field PLTC = 6
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Figure D19. Typical Timing for Single-Precision Operations in Pipeline Mode;
Perform a Series of Addition Operations A PLUS B, C PLUS D,
E PLUS F, ...; Mode Register Field PLTC = 3
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Figure D20. Typical Timing for Double-Precision Operations in Pipeline Mode;
Perform a Series of Addition Operations A PLUS B, C PLUS D,
E PLUS F, ...; Mode Register Field PLTC = 3
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DISTINCTIVE CHARACTERISTICS 

■ Relocatable Macro Assembler supports the complete
Am29000™ microprocessor instruction set.

■ Linker/Loader combines separately assembled
modules by resolving external references and
by searching libraries.

■ Librarian provides a management facility for
organizing modules into logical collections of
functions.

■ IEEE Software Floating-Point Emulation
routines.

■ Available for the PC-AT™ and Sun-3™
development environments.



GENERAL DESCRIPTION 

Processor performance depends on the processor's
hardware and software environment. The key to
maximizing performance lies in the realization that the
processor is part of a system: a collection of components
that must be integrated properly. To take
advantage of the advanced RISC architecture of the
Am29000 microprocessor, equally sophisticated
software tools must be available.

The ASM29K™ cross-development toolkit offers such
a development environment for creating efficient and
portable Am29000 microprocessor software. The package
consists of the assembler, the linker, the floating-point
emulation routines, and the object module librarian.
These tools allow users to design more efficient
systems and applications than ever before.



Cross-development is the design of an application program
on one computer (the host system) and the execution
of that same application program on a different
computer (the target system). The operating system on
the host, such as UNIX™ or DOS, provides the tools
needed to create the application program. These tools
include editors for writing the source code, compilers
and assemblers for translating the modules into
executable code, and utilities for preparing the application
for execution. The Am29000 microprocessor-based target
computer generally does not provide the tools
required to develop the application program. Figure 1
shows the path that an application follows from development
on the host system to execution on the target
system.
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Figure 1. Cross Software Development 



Publication # 10292   Rev. B   Amendment /0
Issue Date: September 1989



29K Family Support Tools 



The ASM29K cross-development toolkit transforms a
PC or Sun-3 workstation host into a powerful software
development environment. ASM29K software assembles
user source and produces a relocatable object
module. This module can be combined with other
relocatable object modules (derived from the assembler
or high-level language cross-compilers) using the
ASM29K linker. Library modules prepared by the librarian
can be linked in at this point as well. The resulting
absolute object module then can be downloaded to a
target system.

AMD has established and published the Am29000
microprocessor Common Object File Format (COFF) to
which all Am29000 development tools conform. The
AMD COFF format extends the already standard AT&T
COFF format to support source-level debugging and
other Am29000 microprocessor-specific features. Similarly,
AMD has established a common calling convention
that maximizes performance on the Am29000
microprocessor and defines another standard
for software vendors. This has led to a variety of compilers,
assemblers, debuggers, and associated tools
that may be mixed freely by developers of Am29000
microprocessor software.

The contents of the ASM29K cross-development toolkit
include: 

ASM29K macro assembler 

ASM29K linker 

ASM29K librarian 

Hex utilities 

IEEE floating-point emulation routines 

Documentation 



ORDERING INFORMATION 

Licensing 

The ASM29K cross-development toolkit is licensed 
through AMD's Standard End-User Software License 
Agreement (Boxtop). This license does not require a 
signature; breaking the seal on the software envelope 
indicates acceptance of the license terms. If changes 
are required to the license agreement, they can be ar- 
ranged through your AMD sales representative. Many 
software products require the customer to provide a 
CPU ID number when ordering the product. Contact 
your sales representative if this information is not avail- 
able at the time of purchase. In addition, terms of the 
license require the customer to complete a Software 
Warranty card with the serial number and site of the host 
computer on which the software will reside. This card 
must be returned to AMD within 30 days of receipt for the
warranty to be valid. 



Order Numbers 

The ASM29K cross-development toolkit is available for 
several different environments. Documentation can be 
ordered separately. The order number (valid combina- 
tion) is formed as a combination of: 

Product Family 

Product Category 

Product Identifier 

License Type 

Host / OS Type 

Media Type 
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ORDER INFORMATION (continued) 

AM29000 SW/ ASM B ## ##



Media Type 

08 = 0.25" Sun cartridge tape, TAR format 
14 = 3.5" DSHD floppies 
21 = 9-track, 1600 BPI mag tape, TAR format 
24 = 5.25" DSHD floppies 



Host / OS Type 

07= Sun-3 
10 = PC-AT 



License Type 

B = Boxtop 
S = Signed 
"-" = Not Applicable 

Product Identifier 

ASM = ASM29K Cross-Development Toolkit 

Product Category 

SW/ = Software Product 

DC/ = Documentation Product 

MA/ = Maintenance Agreement 

Product Family 

Am29000 Microprocessor 

Valid Combinations 

Valid Combinations list configurations planned to be supported in volume for this device. Consult the local AMD sales
office to confirm availability of specific valid combinations and to check on newly released combinations.



Order Number         Product                Host    Media
-------------------  ---------------------  ------  ----------------------------------
AM29000SW/ASMB0708   ASM29K Toolkit         Sun-3   0.25" cartridge tape, TAR format
AM29000SW/ASMS0708   ASM29K Toolkit         Sun-3   0.25" cartridge tape, TAR format
AM29000SW/ASMB0721   ASM29K Toolkit         Sun-3   9-track, 1600 BPI tape, TAR format
AM29000SW/ASMS0721   ASM29K Toolkit         Sun-3   9-track, 1600 BPI tape, TAR format
AM29000SW/ASMB1014   ASM29K Toolkit         PC-AT   3.5" DSHD floppies
AM29000SW/ASMS1014   ASM29K Toolkit         PC-AT   3.5" DSHD floppies
AM29000SW/ASMB1024   ASM29K Toolkit         PC-AT   5.25" DSHD floppies
AM29000SW/ASMS1024   ASM29K Toolkit         PC-AT   5.25" DSHD floppies
AM29000DC/ASM-99     ASM29K Documentation   UNIX    Not Media Specific
AM29000MA/ASM-07     ASM29K Maintenance     Sun-3   Not Media Specific
AM29000MA/ASM-10     ASM29K Maintenance     PC-AT   Not Media Specific
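
As an illustration of how an order number is assembled from the fields listed under Order Numbers, the following C sketch concatenates the fields in the order the key shows them. The function name and parameter layout are hypothetical and are not part of any AMD tool; the field codes are taken from the key and the Valid Combinations table. Documentation and maintenance products carry no media code, so the sketch assumes an empty string is passed for that field.

```c
#include <stdio.h>

/* Compose an order number of the form
 *   AM29000 <category>/ <product> <license> <host> <media>
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small. Hypothetical helper. */
int compose_order_number(char *out, size_t out_size,
                         const char *category,  /* "SW", "DC", or "MA"      */
                         const char *product,   /* "ASM"                    */
                         char license,          /* 'B', 'S', or '-'         */
                         const char *host,      /* "07", "10", or "99"      */
                         const char *media)     /* "08", "14", "21", "24",
                                                   or "" when not applicable */
{
    int n = snprintf(out, out_size, "AM29000%s/%s%c%s%s",
                     category, product, license, host, media);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Calling `compose_order_number(buf, sizeof buf, "SW", "ASM", 'B', "07", "08")` fills `buf` with `AM29000SW/ASMB0708`, the Boxtop-licensed Sun-3 toolkit on 0.25" cartridge tape. The sketch does not check that a combination is actually orderable; the Valid Combinations table remains the reference for that.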
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FUNCTIONAL INFORMATION 

Assembler 

The ASM29K assembler converts user-written 
Am29000 assembly code into relocatable object mod- 
ules. It produces standard COFF object modules that 
can be linked with other assembled or compiled mod- 
ules. Its advanced features permit the design of well- 
structured modules that are easily maintained. 

The assembler processes Am29000 microprocessor 
instructions as defined in Chapter 8 of the Am29000 
User's Manual. Each instruction mnemonic and register 
identifier is recognized in both upper and lower case. 
Identifiers (that is, user-named variables) can have up to 
63 characters, all of which are significant. Integer, char- 
acter, string, and floating-point constants are supported 
as well as complex expression analysis. 

In addition to the Am29000 microprocessor instructions,
the assembler supports a powerful macro facility.
Programmers can define macros with multiple parameters
and direct macros to be repeated a specified number of
times. Macro code is inserted into the source code at the
position of the macro call. Macros may use local labels
(labels that are visible only within the macro itself)
to label an instruction that can be copied several
times throughout the program. Local labels are
distinguished from regular labels by using the format "$n,"
where n can be from one to six digits.
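
The local-label format can be expressed as a small check. The C sketch below is only an illustration of the "$n" rule stated above (a dollar sign followed by one to six decimal digits); the function name is hypothetical, and the source does not describe how the assembler itself recognizes labels.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if s has the local-label form "$n", where n is one to six
 * decimal digits; otherwise return 0. Hypothetical helper. */
int is_local_label(const char *s)
{
    size_t digits;

    if (s == NULL || s[0] != '$')
        return 0;
    digits = strlen(s + 1);          /* characters after the '$' */
    if (digits < 1 || digits > 6)
        return 0;
    for (s++; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

Under this rule `$1` and `$123456` are local labels, while `$`, `$1234567`, and `loop` are not.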

The assembler also provides a number of directives for
organizing the code into efficient sections or modules.
Use of the include directive merges separate files during
assembly. The section directive assigns areas of code
to named text, data, uninitialized memory, or initialized
memory sections. Conditional assembly is also supported.
This useful feature allows the programmer to
assemble code conditionally for debugging. The
assembler directives are listed in Table 1.

The ASM29K software also produces a cross-reference 
table for symbols. Flags allow the programmer to print 
listings that contain expanded macros, instructions not 
assembled due to conditional statements, and symbol 
tables; and to insert user-specified headers into the 
listing. 

The assembler optionally emits debug information for 
use with the XRAY29K™ source-level debugger. This
information allows the programmer to specify the sym- 
bolic names of variables and labels during debugging 
sessions. 

The wide selection of features available in the ASM29K 
assembler gives the user the latest tools to produce 
well-structured and maintainable code. 



Linker 

The ASM29K linker integrates a group of separately 
compiled or assembled modules into a composite 
module in which all references between modules are 
resolved. It processes and produces COFF modules, 
including any module produced by a compiler in any 
language and any assembler that adheres to the AMD- 
defined COFF and calling-convention standards. Incre- 
mental linking is supported also. The ASM29K linker 
produces an extensive load map with an optional 
symbol cross-reference table. 

Object module libraries are searched with required 
modules automatically included. All code and data sec- 
tions are given absolute addresses as specified by the 
programmer. The linker provides options that create 
ROMable programs, generate warnings for possible 
undefined external references, produce a global cross- 
reference, and list defined symbols. Directives to the 
linker may be included in a file (batch mode), on the 
command line, or in combination. Programmers can use
the ASM29K linker to:

- Resolve external references between separately
compiled or assembled modules.

- Assign absolute addresses.

- Direct section ordering.

- Perform incremental linking.

- Load only those library modules referenced, for
efficient use of code space.

- Optionally generate ROMable programs.

Librarian 

The ASM29K librarian is a management facility for or- 
ganizing independently developed pieces of software 
into logical units. It permits the addition, deletion, and re- 
placement of object modules in one or more libraries. 
The ASM29K librarian: 

- Organizes and initializes modules into a library file.

- Lists library contents and information.

- Lists a library directory.
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Table 1. Assembler Directives 



Group                      Directive   Meaning
-------------------------  ----------  ----------------------------------------------------
File Processing            .end        End of Assembly
                           .err        Generate Assembly Error
                           .ident      Specify Module Name
                           .include    Include Text File

Conditional Assembly       .else       Alternate Condition
                           .endif      End of Conditional Assembly Block
                           .if         Assemble if Value is Not Zero
                           .ifdef      Assemble if Identifier is Defined
                           .ifeqs      Assemble if Strings are Equal
                           .ifnes      Assemble if Strings are Not Equal
                           .ifnotdef   Assemble if Identifier is Not Defined

Listing Control            .eject      Advance to Top of Page
                           .lflags     Set Listing Flags
                           .list       Enable Listing
                           .nolist     Disable Listing
                           .print      Print to Standard Output
                           .sbttl      Set the Listing Subtitle
                           .space      Space N Lines
                           .title      Set the Listing Title

Symbol Declaration         .equ        Equate a Symbol to a Value (Unlimited Scope)
                           .extern     Declare Symbols as External to This Module
                           .global     Make Symbols Visible to Other Modules
                           .reg        Declare a Symbol as a Synonym for a Register
                           .set        Set a Symbol to a Value (Limited Scope)

Section Declaration        .comm       Declare a Common Symbol
                           .data       Use the .data Section
                           .dsect      Declare a Dummy Section
                           .lcomm      Declare a Local bss Symbol
                           .sect       Declare a New Section
                           .text       Use the .text Section
                           .use        Use a Declared Section

Data Storage Declaration   .align      Specify Byte Alignment
                           .ascii      Store the String
                           .block      Reserve Bytes
                           .byte       Initialize Bytes
                           .double     Initialize Double-Precision Values
                           .extend     Initialize Extended-Precision Values
                           .float      Initialize Single-Precision Values
                           .hword      Initialize Half-Words
                           .word       Initialize Words

Repeat Block               .endr       End of Repeat Block
                           .irep       Repeat for Each Item in the List
                           .irepc      Repeat for Each Character in the String
                           .rep        Repeat N Times

Macro Definition           .endm       End Macro Definition
                           .exitm      Terminate Macro Expansion
                           .macro      Macro Heading
                           .purgem     Purge All Macros Listed

High-Level Language (HLL)  .def        Define Symbol Table Entry Directive
Debugging                  .dim        Dimensions of an Array Attribute
                           .endef      End of Symbol Definition Block Directive
                           .file       Source Filename Directive
                           .line       Source-File Line-Number Directive
                           .ln         HLL Source-File Line-Number Directive
                           .scl        Storage Class of a Symbol Attribute
                           .size       Size of a Symbol Attribute
                           .tag        Structure, Union, or Enumeration Identifier Attribute
                           .type       Basic and Derived Type of a Symbol Attribute
                           .val        Value of a Symbol Attribute
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Floating-Point Emulation 

The Am29000 microprocessor instruction set includes 
floating-point and integer math operations. In the cur- 
rent processor implementation, these instructions 
cause traps to routines that perform the operations. The 
user is provided with source to two complete sets of rou- 
tines that emulate IEEE Floating-Point Standard 754 for 
each of the instructions listed in Table 2. 

The first set of routines is provided for users who
have integrated an Am29027™ arithmetic accelerator
into their systems. The Am29000 microprocessor
math instructions are emulated using the Am29027
co-processor.

The second set of routines implements emulation of the 
floating-point operations entirely in software. No special 
hardware is required. 

Documentation instructs users how to integrate the
package into their target system. Both packages are
designed to ensure upward compatibility with
next-generation processors.
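
To give a sense of what an all-software emulation routine has to handle, the C sketch below implements an IEEE 754 single-precision "compare equal" (the operation Table 2 lists as FEQ) directly on 32-bit patterns. It is not AMD's routine: the function names are hypothetical, the result is returned as a plain 0 or 1 rather than in the Am29000's own result convention, and exception flags and trap handling are left out.

```c
#include <stdint.h>

/* Nonzero if the single-precision bit pattern is a NaN:
 * exponent field all ones and a nonzero fraction. */
static int f32_is_nan(uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u &&
           (bits & 0x007FFFFFu) != 0;
}

/* Software "compare equal" on IEEE 754 single-precision bit patterns.
 * Returns 1 if the two values are equal, 0 otherwise. */
int soft_feq(uint32_t a, uint32_t b)
{
    if (f32_is_nan(a) || f32_is_nan(b))
        return 0;                   /* a NaN compares unequal to everything */
    if (((a | b) & 0x7FFFFFFFu) == 0)
        return 1;                   /* +0 and -0 compare equal */
    return a == b;                  /* otherwise equal only if bit-identical */
}
```

The two special cases are the reason a plain integer comparison is not enough: `soft_feq(0x00000000, 0x80000000)` must report equality for +0 and -0, and `soft_feq(0x7FC00000, 0x7FC00000)` must report inequality because both operands are NaNs.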



Table 2. Arithmetic Instructions 



Type                                         Mnemonic    Operation
-------------------------------------------  ----------  ----------------------------------------
Integer Arithmetic                           MULTIPLY    Signed Multiply
                                             MULTIPLYU   Unsigned Multiply
                                             DIVIDE      Signed Divide
                                             DIVIDEU     Unsigned Divide

Single-Precision Floating-Point Arithmetic   FADD        Single-Precision Add
                                             FSUB        Single-Precision Subtract
                                             FMUL        Single-Precision Multiply
                                             FDIV        Single-Precision Divide

Double-Precision Floating-Point Arithmetic   DADD        Double-Precision Add
                                             DSUB        Double-Precision Subtract
                                             DMUL        Double-Precision Multiply
                                             DDIV        Double-Precision Divide

Floating-Point Compare                       FEQ         Single Compare Equal To
                                             DEQ         Double Compare Equal To
                                             FGT         Single Compare Greater Than
                                             DGT         Double Compare Greater Than
                                             FGE         Single Compare Greater Than Or Equal To
                                             DGE         Double Compare Greater Than Or Equal To

Data Format Conversion                       CONVERT     Convert Data Format



Hex Utilities 

A set of hex utilities is provided to create hex files for
downloading into target systems and for creating ROM
images. These tools convert AMD standard COFF files
into Motorola® S-Record or Tektronix® Extended Hex
files. The utilities and a brief description of each
are listed below.

■ btoa      Converts a binary file into an ASCII file.

■ coff2hex  Converts a COFF file into a hex file.

■ nm29      Prints the name list of a COFF file.

■ romcoff   Generates a COFF file for ROM.

■ cvcoff    Translates Am29000 microprocessor
            COFF files between big-endian and
            little-endian hosts.

■ strpcoff  Strips symbolic information from a
            COFF file.

■ sim29     ASM29K software architectural
            simulator.
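
For readers unfamiliar with the Motorola S-Record format mentioned above, the C sketch below formats a single S1 data record: the record type, a byte count, a 16-bit address, the data bytes, and a checksum that is the one's complement of the low byte of the sum of the count, address, and data bytes. This is an illustration of the file format only. The function name is hypothetical, and the source does not say which record types coff2hex emits; 32-bit load addresses would need S3 records, which carry a four-byte address.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one S1 record (16-bit address) into out as a NUL-terminated
 * string. Returns the number of characters written, or -1 if the data
 * is too long for one record or the buffer is too small. */
int srec_format_s1(uint16_t addr, const uint8_t *data, size_t len,
                   char *out, size_t out_size)
{
    /* Count field covers 2 address bytes, the data, and 1 checksum byte. */
    size_t count = len + 3;
    /* "S1" + count + address + data + checksum + NUL. */
    size_t needed = 2 + 2 + 4 + 2 * len + 2 + 1;
    unsigned sum;
    size_t i;
    int pos;

    if (count > 0xFF || out_size < needed)
        return -1;

    sum = (unsigned)count + (unsigned)(addr >> 8) + (unsigned)(addr & 0xFF);
    pos = snprintf(out, out_size, "S1%02X%04X",
                   (unsigned)count, (unsigned)addr);
    for (i = 0; i < len; i++) {
        sum += data[i];
        pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                        (unsigned)data[i]);
    }
    /* Checksum: one's complement of the low byte of the running sum. */
    pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                    (~sum) & 0xFFu);
    return pos;
}
```

A 16-byte record at address 0x7AF0 whose data begins 0A 0A 0D and is otherwise zero has a count of 0x13 and a sum of 0x19E, so the checksum is the complement of 0x9E, which is 0x61.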
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WARRANTY and SUPPORT 

Software Warranty 

Software programs licensed by AMD are covered by the 
warranty and patent indemnity provisions appearing in 
AMD's standard software license forms. AMD makes no 
warranty, express, statutory, implied or by description, 
regarding the information set forth herein or regarding 
the freedom of the described software program from 
patent infringement. AMD reserves the right to modify, 
change or discontinue the availability of this software 
program at any time and without notice. 

Customer Support 
Maintenance 

All orderable software products include one year of free
Maintenance Support, which starts from the date of
original purchase. Maintenance Support allows customers
to receive technical assistance from highly trained
field and factory personnel, to use a call-in on-line
information system, and to receive product and documentation
updates at no additional charge. Customers may
extend Maintenance Support in one-year increments.
Customers can access support services by calling
the 24-hour, toll-free 29K™ Family hotline at (800)
2929-AMD (292-9263).

On-Line Call-In Bulletin Board

In addition to the support engineering staff, AMD offers
a 24-hour on-line technical support center. The customer
can call (800) 2929-AMD at any time to query the
system for the latest information on a particular product:
bug fixes, work-arounds, information on upcoming
releases, etc. Messages may be left for the support
engineering staff during "after hours."

Training Classes 

AMD offers training classes for the 29K Family prod- 
ucts. These classes focus on 29K Family system design 
and implementation using the broad range of AMD soft- 
ware development tools. Customers can shorten the de- 
velopment process through extensive hands-on training 
covering a variety of topics. Contact your local AMD field
office for more information on training classes. 

Fusion29K Program

AMD encourages broad-based development and support
for the Am29000 microprocessor with the
Fusion29K™ program, a joint-effort program between
AMD and third-party developers. Published twice a
year, the Fusion29K program catalog reveals the
breadth of development and system solutions for the
29K Family, including software generation and debug
tools; hardware development tools; executive, kernel,
and multi-user operating systems; board-level products;
silicon products; and more. For a copy of the Fusion29K
program catalog, call your local AMD field sales office or
the literature center at (800) 222-9323.
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Preliminary 



HighC29K 

Cross-Development Toolkit, Release 2 



Advanced Micro Devices



DISTINCTIVE CHARACTERISTICS 

■ Efficient, globally optimizing C compiler technology
developed by MetaWare™, Inc. ANSI Standard C support
and conformance verification (ANSI document
X3J11/88-159, December 7, 1988) and compile-time
error checking.

■ Compiler supports load scheduling and delayed
branch optimizations to promote fast
Am29000™ microprocessor code execution.

■ Compiler supports AMD's Am29027™ Arithmetic
Accelerator.

■ Full ANSI standard run-time library of over 100
functions includes all standard I/O routines
(stdio).

■ Available for the PC-AT™ and Sun-3™ development
environments.

■ Special library of high-performance transcendental
functions.

■ HighC29K™ toolkit includes the entire
ASM29K™ Cross-Development Toolkit. The
ASM29K package contains:

- Relocatable macro assembler supports the complete
Am29000 microprocessor instruction set.

- Linker/loader combines separately compiled or
assembled modules by resolving external references
and by searching libraries.

- Librarian provides a management facility for
organizing modules into logical collections of
functions.

- Full architectural simulator of the Am29000
microprocessor with user-defined memory
access times. Allows designers to obtain
price/performance statistics for their particular
Am29000 microprocessor design.

- IEEE software floating-point emulation functions
accessible from C and assembly language
modules.



GENERAL DESCRIPTION 

Processor performance depends on the processor's 
hardware and software environment. The key to maxi- 
mizing performance lies in the realization that the proc- 
essor is part of a system which is a collection of compo- 
nents which must be properly integrated. To take ad- 
vantage of the advanced RISC architecture of the 
Am29000 microprocessor, equally sophisticated soft- 
ware tools must be available to achieve this integration. 

The HighC29K™ Cross-Development Toolkit offers
such a development environment for creating efficient
and portable software for the 29K™ Family. The package
consists of the full ANSI standard optimizing
compiler, run-time libraries, assembler, linking loader,
floating-point emulation, and object module librarian.
These tools allow users to design more efficient
systems and applications.

Cross-development is the design of an application pro- 
gram on one computer (the host system) and the execu- 
tion of that same application program on a different 
computer (the target system). The operating system on 
the host, such as UNIX or DOS, provides the tools 
needed to create the application program. These tools 
include editors for writing the source code, compilers 
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and assemblers for translating the modules into execut- 
able code, and utilities for preparing the application for 
execution. The Am29000-based target computer gener- 
ally does not provide the tools required to develop the 
application program. Figure 1 shows the path that an 
application follows from development on the host sys- 
tem to execution on the target system. 

The HighC29K Cross-Development Toolkit transforms 
a PC or Sun workstation host into a powerful software 
development environment. The HighC29K cross-com- 
piler generates 29K Family relocatable object modules 
which can be combined with other relocatable object 
modules derived from the assembler or HighC29K com- 
piler using the 29K Family linker/loader. Library mod- 
ules prepared by the librarian can be linked in at this 
point as well. The resulting absolute object module can 
then be downloaded to a target system. 

AMD has established and published the 29K Family
Common Object File Format (COFF) to which all 29K 
Family development tools conform. The AMD COFF 
format extends the already standard AT&T COFF for- 
mat to support source-level debugging and other 29K 
Family-specific features. Similarly, AMD has estab- 
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lished a common calling convention that maximizes 
performance on the 29K Family of microprocessors as 
well as defining standards for software vendors. This 
has led to a variety of compilers, assemblers, debuggers,
and associated tools that may be mixed freely by
developers of 29K Family software.

The contents of the HighC29K Cross-Development 
Toolkit include: 



HighC29K:
  Optimizing C Compiler
  Documentation
  Function Libraries

ASM29K (included in HighC29K Development Package):
  Relocatable Macro Assembler
  Documentation
  Architectural Simulator
  Linker/Loader
  Librarian
  IEEE Floating-Point Emulation Routines
  Utilities







[Figure 1 diagram. Host Computer: Edit -> Source Code -> Compile/Assemble -> Object Files -> Link (with Library Files) -> COFF File -> Load. Target Computer: the COFF file is loaded into the Am29000 microprocessor system via an on-board monitor or the ADAPT29K debugger.]

Figure 1. Cross Software Development
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ORDERING INFORMATION 

Licensing 

The HighC29K Cross-Development Toolkit is licensed 
through AMD's Standard End-User Software License 
Agreement (Boxtop). This license does not require a 
signature; breaking the seal on the software package in- 
dicates acceptance of the license terms. If changes are 
required to the license agreement, they can be ar- 
ranged through your AMD sales representative. Many 
software products require the customer to provide a 
CPU ID number when ordering the product. Contact 
your sales representative if this information is not avail- 
able at time of purchase. In addition, terms of the li- 
cense require the customer to complete a Software 
Warranty card with the serial number and site of the 
host computer on which the development package will 
reside. This card must be returned to AMD within 30 
days of receipt for the warranty to be valid. 



Order Numbers 

The HighC29K Cross-Development Toolkit is available 
for several different environments. Documentation can 
be ordered separately. The order number (Valid Combi- 
nation) is formed as a combination of: 

• Product Family 

• Product Category 

• Product Identifier 

• License Type 

• Host/OS Type 

• Media Type 



AM29000 SW/ HCC B ## ## 



Media Type 

08 = 0.25" Sun cartridge tape, TAR format 

14 = 3.5" DSHD floppies 

21 = 9-track, 1600 BPI mag tape, TAR format 

24 = 5.25" DSHD floppies 



Host/OS Type 

07 = Sun-3 

10 = PC-AT 

99 = Not Host Specific 

License Type 

B = Boxtop 
S = Signed 
"-"= Not Applicable 

Product Identifier 

HCC = HighC29K Cross-Development Toolkit 

Product Category 

SW/ = Software Product 

DC/ = Documentation Product 

MA/ = Maintenance Agreement 

Product Family 

Am29000 Microprocessor 






Valid Combinations 



Valid Combinations list configurations planned to be supported in volume for this device. Consult the local AMD sales
office to confirm availability of specific valid combinations and to check on newly released combinations.



Order Number         Product                  Host                Media
-------------------  -----------------------  ------------------  ----------------------------------
AM29000SW/HCCB0708   HighC29K Toolkit         Sun-3               0.25" cartridge tape, TAR format
AM29000SW/HCCS0708   HighC29K Toolkit         Sun-3               0.25" cartridge tape, TAR format
AM29000SW/HCCB0721   HighC29K Toolkit         Sun-3               9-track, 1600 BPI tape, TAR format
AM29000SW/HCCS0721   HighC29K Toolkit         Sun-3               9-track, 1600 BPI tape, TAR format
AM29000SW/HCCB1014   HighC29K Toolkit         PC-AT               3.5" DSHD floppies
AM29000SW/HCCS1014   HighC29K Toolkit         PC-AT               3.5" DSHD floppies
AM29000SW/HCCB1024   HighC29K Toolkit         PC-AT               5.25" DSHD floppies
AM29000SW/HCCS1024   HighC29K Toolkit         PC-AT               5.25" DSHD floppies
AM29000DC/HCC-99     HighC29K Documentation   Not Host Specific   Not Media Specific
AM29000MA/HCC-07     HighC29K Maintenance     Sun-3               Not Media Specific
AM29000MA/HCC-10     HighC29K Maintenance     PC-AT               Not Media Specific



FUNCTIONAL INFORMATION 
Compiler 

The HighC29K cross-compiler supports an extended 
version of the C language designed for professional 
programmers. It includes a full ANSI implementation for 
portable applications, yet also allows user access to the 
best features of other languages such as nested func- 
tions from Pascal and named parameter association 
from Ada. Extensions to the C language also are sup- 
ported, such as range notation in case statements and 
enumerated data types. The compiler allows users to 
create re-entrant procedures and to generate efficient 
code in terms of space and execution speed. 

The HighC29K cross-compiler facilitates program de- 
velopment for dedicated or stand-alone Am29000 de- 
signs. The compiler generates optimized, sharable 
code that takes full advantage of the Am29000 instruc- 
tion set. The language contains a variety of control 
statements, data types, and predeclared procedures 
and functions that promote the development of well- 
structured programs. For example, the user may specify 
the parameter types for external functions so that the 
compiler can check that arguments are passed cor- 
rectly. 
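
The argument checking described above corresponds to function prototypes in ANSI C. The sketch below is a generic ANSI C illustration, not HighC29K-specific syntax; the function names are made up for the example.

```c
/* Prototype: states the parameter types of an external function, so the
 * compiler can check each call against them. */
extern long scale(long value, int shift);

/* Definition; in a real project this would live in another module. */
long scale(long value, int shift)
{
    return value * (1L << shift);
}

/* Correct call: both arguments match the declared parameter types. */
long scaled_example(void)
{
    return scale(3L, 4);
}

/* With the prototype in scope, a call such as scale("3", 4) or scale(3L)
 * is diagnosed at compile time instead of failing at run time. */
```

Here `scaled_example()` returns 48. Without the prototype, a mismatched call would compile silently and pass the wrong data to `scale`.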

The HighC29K cross-compiler generates 29K Family 
object modules directly. The HighC29K compiler option- 
ally generates information necessary for symbolic de- 
bugging at the C or assembly level with XRAY29K™, 
AMD's source-level debugger for the 29K Family. The 
compiler preprocessor allows the user to define macros, 
merge files into source and conditionally include or ex- 
clude code. 
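
The three preprocessor facilities named above can be sketched in standard C as follows. The macro names, the `TRACE_ENABLED` build switch, and the function `clamp_to_buffer` are illustrative assumptions, not identifiers from the toolkit:

```c
#include <stdio.h>   /* merges a header file into the source */

#define BUFFER_WORDS 64                       /* object-like macro */
#define MAX(a, b) ((a) > (b) ? (a) : (b))     /* function-like macro */

/* Conditional inclusion: the tracing code is compiled only when the
 * hypothetical TRACE_ENABLED switch is defined for the build. */
#ifdef TRACE_ENABLED
#define TRACE(msg) fputs(msg, stderr)
#else
#define TRACE(msg) ((void)0)
#endif

/* Clamps a requested size to the range 0..BUFFER_WORDS. */
int clamp_to_buffer(int requested)
{
    int nonneg = MAX(0, requested);

    TRACE("clamp_to_buffer called\n");
    return nonneg > BUFFER_WORDS ? BUFFER_WORDS : nonneg;
}
```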

Optimization 

As a highly optimizing cross-compiler, HighC29K software ensures the
generation of fast, compact code by using advanced optimization
techniques including common subexpression elimination, loop invariant
analysis, global register allocation and automatic allocation of
variables to registers. Many of the optimizations are particularly
effective when using the unique features of the Am29000 microprocessor
architecture. For example, its large register set means passing
parameters in registers is more effective on the Am29000 microprocessor
than on any other microprocessor. Optimizations specifically developed
for the Am29000 RISC microprocessor architecture are also performed,
such as load scheduling for maximum instruction throughput.
Additionally, the compiler makes extensive use of the Am29000
microprocessor's large register file as a stack cache to store
frequently accessed values. The list of optimizations performed
includes:

• Common subexpression elimination

• Retention/reuse of register contents

• Automatic allocation of variables to registers

• Dead code elimination and cascaded jumps

• Cross jumping (tail merging)

• Constant folding

• Switch statements optimally encoded using in-line
branch table, binary search or linear search

• Global flow analysis leading to removal of loop
invariant values

• Load scheduling

• Delayed branch

Several of these optimizations are explained below: 

Loop Invariant Analysis: Computations made inside a loop whose values
do not change within the loop can be moved outside the loop. The value
is stored in a register for optimum access. Since an application may
spend as much as 90% of its time executing loops, this optimization
produces a significant gain in performance.

Fold Constants: Operands that are constant can often be folded into a
single constant, or into a temporary value. If constants are defined at
compile time, the compiler can reduce them to a single value.
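
The effect of loop invariant analysis and constant folding can be illustrated by hand in C. This is only a sketch of the kind of transformation described, not output from the HighC29K compiler; the function names and array sizes are assumptions made for the example:

```c
#define ROWS 4
#define COLS 8

/* As written by the programmer: scale * offset is recomputed on every
 * iteration, and ROWS * COLS is a compile-time constant expression. */
void fill_naive(int out[ROWS * COLS], int scale, int offset)
{
    int i;

    for (i = 0; i < ROWS * COLS; i++)
        out[i] = i + scale * offset;
}

/* Roughly what an optimizer could produce: the constant is folded to
 * 32, and the invariant product is computed once before the loop so it
 * can stay in a register. */
void fill_hoisted(int out[32], int scale, int offset)
{
    int i;
    int bias = scale * offset;   /* loop-invariant value */

    for (i = 0; i < 32; i++)
        out[i] = i + bias;
}
```

Both functions store the same values; the second simply does less work per iteration.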

Load Scheduling: The Am29000 microprocessor sup- 
ports overlapped load and store capabilities to decrease 
delays incurred while waiting for data. The compiler 
recognizes when certain instructions can be advanced 
in the pipeline for efficient operation. 

Delayed Branch: The Am29000 microprocessor branch instruction is
delayed by one cycle to allow the processor pipeline to achieve maximum
throughput. The instruction following the branch instruction, called
the delay instruction, is executed whether the branch is successful or
not. In most cases, the compiler can easily place a useful instruction,
i.e., an instruction other than NO-OP, as the delay instruction by
reorganizing the code.

Data Types 

The single addressing mode of the Am29000 microprocessor combines with
high-level language implementations to provide efficient access to all
data types.



Data Type         Size (Bits)
int               32
long int          32
pointer           32
short int         16
char              8
float             32
double            64
unsigned          32
unsigned char     8
unsigned short    16
enum (default)    32
enum (option)     8, 16, 32
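
When code is moved between a host compiler and the cross-compiler, it can help to compare the host's type sizes with the table above. The following is a hedged sketch in standard C; the structure, the `BITS` macro, and the function name are illustrative assumptions, and the enum rows are left out because their size depends on a compiler option:

```c
#include <limits.h>
#include <stddef.h>

struct type_size {
    const char *name;
    unsigned table_bits;  /* size listed in the table above */
    unsigned host_bits;   /* size under the compiler building this file */
};

#define BITS(type) ((unsigned)(sizeof(type) * CHAR_BIT))

static const struct type_size type_sizes[] = {
    { "int",            32, BITS(int) },
    { "long int",       32, BITS(long int) },
    { "pointer",        32, BITS(void *) },
    { "short int",      16, BITS(short int) },
    { "char",            8, BITS(char) },
    { "float",          32, BITS(float) },
    { "double",         64, BITS(double) },
    { "unsigned",       32, BITS(unsigned) },
    { "unsigned char",   8, BITS(unsigned char) },
    { "unsigned short", 16, BITS(unsigned short) },
};

#define TYPE_COUNT (sizeof type_sizes / sizeof type_sizes[0])

/* Returns the number of types whose host size differs from the table. */
size_t count_size_mismatches(void)
{
    size_t i, mismatches = 0;

    for (i = 0; i < TYPE_COUNT; i++)
        if (type_sizes[i].table_bits != type_sizes[i].host_bits)
            mismatches++;
    return mismatches;
}
```

A nonzero result on the host (for example, where `long int` or pointers are 64 bits wide) marks the types that deserve attention when porting.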



Am29027 Arithmetic Accelerator Support 

Target systems that include the Am29027 Arithmetic 
Accelerator for high-speed computations are directly 
supported through the compiler. Users may direct the 
compiler to generate in-line code to access the control 
and instruction registers of the accelerator. Versions of 
the libraries that assume direct use of the Am29027 
microprocessor are included. 

Alternatively, the user can signal the compiler to gener- 
ate Am29000 microprocessor floating-point instructions 
that are used in conjunction with the IEEE Floating- 
Point Emulation Routines to access the accelerator. 

The HighC29K Cross-Development Toolkit includes
AMD's entire ASM29K Cross-Development Toolkit. Details
of this package are contained in the ASM29K
Cross-Development Toolkit data sheet (order #10292).



Function Libraries 

The HighC29K toolkit includes three different sets of 
function libraries that enhance the functionality of the 
compiler. The library sets are comprised of: 

• the ANSI standard library, which provides the full set
of functions specified by the ANSI C language standard

• a library of routines implementing the floating-point
environment functions specified in the IEEE-754
standard

• a library of hand-coded transcendental functions
optimized for use with the Am29000/Am29027
microprocessor combination.

Each library set contains several versions of the library 
which reflect the different possible target environments. 
The compiler driver is able to select the proper version 
of the library to use based on the compile-time options 
specified. 

ANSI Standard Library 

This library contains the full functionality specified by
the ANSI standard for the C language (X3J11/88-159,
December, 1988). At the lowest level, the library functions
interface with HIF (Host Interface), a small kernel
system defined by AMD. HIF is supported in all AMD
products, and is defined in the HighC29K toolkit manual
for the customer who needs to adapt to a different
environment.

The functions included in the ANSI Standard Library 
are: 

Mathematical Routines

abs acos asin atan atan2 ceil
cos cosh exp fabs floor fmod
frexp ldexp log log10 modf pow
sin sinh sqrt tan tanh

Memory Allocation

calloc free malloc realloc

Standard Formatted I/O

fprintf printf sprintf vfprintf vsprintf fscanf scanf
sscanf vprintf _setmode

Standard File I/O 

fclose fopen remove setbuf tmpfile 
fflush freopen rename setvbuf tmpnam 

Character Routines 

isalnum iscntrl isgraph isprint isspace 
isxdigit toupper isalpha isdigit islower 
ispunct isupper tolower 

Character I/O Routines

fgetc fputc getc gets putchar
ungetc fgets fputs getchar putc puts

String Routines

memchr memcmp memcpy memmove memset
strcat strchr strcmp strcoll strcpy
strcspn strerror strlen strncat strncmp
strncpy strpbrk strrchr strspn strstr
strtok strxfrm _strncat _rmemcpy _rstrcpy
strcats

Direct I/O Routines

fgetpos fread fseek fsetpos ftell fwrite
rewind

General Routines

abort strtoul srand exit mblen
atoi atexit system strtod qsort
getenv bsearch atol wcstombs strtol
mbstowcs rand labs mbtowc div
ldiv on- atof exit wctomb

Date and Time Routines

asctime ctime gmtime localtime mktime
strftime time clock difftime

Miscellaneous Routines

assert ferror localeconv perror setjmp
signal va_end clearerr kill longjmp
raise setlocale va_arg va_start feof

Floating-Point Environment Library 

The functions included in the Floating-Point Environ- 
ment Library are: 

class rclass
copysign rcopysign
finite rfinite
isnan risnan
logb rlogb
nextafter rnextafter
remainder rremainder
scalb rscalb
unordered runordered



Fast Transcendental Library 

This library provides special hand-coded versions of the 
standard transcendental functions. These functions are 
optimized for performance with the Am29000/Am29027 
microprocessor combination. 

The functions included are: 

atan cos exp log pow 

sin sqrt tan 

Floating-Point Emulation 

The Am29000 microprocessor's instruction set includes
floating-point and integer math operations. In the simplest
processor implementation, these instructions
cause traps to routines that perform the operations. The
user is provided with source to two complete sets of
routines that emulate IEEE Floating-Point Standard 754
for each of the instructions listed below.

The first set of trap handlers is provided for users who
have integrated the Am29027 arithmetic accelerator
into their systems. The Am29000 microprocessor math
instructions are performed using the Am29027
microprocessor.

The second set of trap handlers implements emulation 
of the floating-point operations entirely in software. No 
special hardware is required. 

Documentation instructs users how to integrate the
package into their target system. Both packages are
designed to ensure upward compatibility with future
generation processors. The floating-point routines are
accessible from both the assembler and compiler.

To eliminate the overhead incurred by using the trap 
handlers, direct code generation (in-line coding) of 
Am29027 microprocessor floating-point operations is 
an included option of the HighC29K Cross-Develop- 
ment Toolkit. 

Am29000 Microprocessor Floating-Point Instructions

Mnemonic    Operation
CONVERT     Convert values between types Integer, Float, and Double
FEQ         Compare Floats Equal
DEQ         Compare Doubles Equal
FGT         Compare Floats Greater Than
DGT         Compare Doubles Greater Than
FGE         Compare Floats Greater Than or Equal
DGE         Compare Doubles Greater Than or Equal
FADD        Float Add
DADD        Double Add
FSUB        Float Subtract
DSUB        Double Subtract
FMUL        Float Multiply
DMUL        Double Multiply
FDIV        Float Divide
DDIV        Double Divide



Utilities 

A set of utilities is provided to work with the output files 
produced by the development tools. They allow the user 
to prepare output files for downloading into target sys- 
tems and to create ROM images. The utilities include: 

• coff2hex: Converts Am29000 microprocessor COFF
files to Motorola® S-record or Extended Tektronix®
Hex Files.

• romcoff: Allows creation of ROM images from
Am29000 microprocessor COFF files.

• cvcoff: Translates Am29000 microprocessor COFF
files between big endian/little endian hosts.

• strpcoff: "Strips" symbolic information from an
executable COFF file.

MAINTENANCE AND SUPPORT

Software Warranty 

Software programs licensed by AMD are covered by the
warranty and patent indemnity provisions appearing in
AMD's standard Software License Forms. AMD makes
no warranty, express, statutory, implied or by description
regarding the information set forth herein or regarding
the freedom of the described software program from
patent infringement. AMD reserves the right to modify,
change or discontinue the availability of this software
program at any time and without notice.

Support 

Customer Support 

All orderable software products include one year of free 
maintenance support, which starts from the date of 
original purchase. Maintenance support allows custom- 
ers to receive technical assistance from highly trained 
field and factory personnel, to use a call-in on-line 
information system and to receive product and docu- 
mentation updates at no additional charge. Customers 
may extend maintenance support in one-year
increments. Customers can access support services
by calling the 24-hour, toll-free 29K Family hotline at
(800) 2929-AMD (292-9263).

On-Line Call-in Bulletin Board

In addition to the support engineering staff, AMD offers
a 24-hour on-line technical support center. The customer
can call (800) 2929-AMD at any time to query the
system for the latest information on a particular product:
bug fixes, work-arounds and information on up-coming
releases. Messages may be left for the support
engineering staff during "after hours."

Training Classes 

AMD offers training classes for the 29K Family prod- 
ucts. These classes focus on 29K Family system design 
and implementation using the broad range of AMD 
software development tools. Customers can shorten 
the development process through extensive hands-on 
training covering a variety of topics. Contact your local 
AMD field sales office for more information on training 
classes. 

Fusion29K Program

AMD encourages broad-based development and support
for the Am29000 with the Fusion29K™ program, a
joint-effort program between AMD and third-party
developers. Published twice a year, the Fusion29K program
catalog reveals the breadth of development and system
solutions for the 29K Family, including software
generation and debug tools; hardware development
tools; executive, kernel and multi-user operating
systems; board-level products; silicon products; and
more. For a copy of the Fusion29K program catalog, call
your local AMD field sales office or the literature center
at (800) 222-9323.

MON29K

Target Resident Debug Monitor

Advanced Micro Devices

DISTINCTIVE CHARACTERISTICS 

■ Provides local control of an Am29000™
microprocessor-based system

■ Interfaces to the XRAY29K™ Source-Level
Debugger

■ Allows modification and display of memory,
registers and I/O ports

■ Supports modification and display of special-purpose
registers by group

■ Allows access to both user- and system-level
code

■ Supports the AMD Am29027™ Arithmetic
Accelerator

■ Allows modification and display of Am29027
microprocessor registers

■ Provides eight breakpoints plus single-
and multiple-instruction stepping

■ Allows selection of user-defined displays after
each breakpoint or single step

■ Provides in-line assembler and disassembler

■ Supports downloading of COFF and hex files
from remote systems

■ Provided in source form (C and Am29000
microprocessor assembly) to simplify
installation of I/O devices

■ Offers familiar user interface, similar to DEBUG
on IBM® PC



GENERAL DESCRIPTION 

The Target Resident Debug Monitor (MON29K™) 
resides on Am29000 microprocessor-based hardware. 
It provides all the control a designer needs to load, 
execute and debug Am29000 microprocessor 
programs. MON29K software is provided in source form 
so its I/O drivers and service routines can be modified 
easily, which allows MON29K software to be 
customized for various hardware configurations. 

MON29K software provides the ability to set
breakpoints, to set and display memory and registers, to
read and write I/O ports, to trace execution in single or
multiple steps, and to download files from a remote
host. MON29K software is controlled by either an ASCII
terminal or a host computer connected to a serial port
on the target system.

MON29K software supports high-level language 
debugging through XRAY29K, the Am29000 
microprocessor source-level debugger. In addition to its 
own standard command set, the XRAY29K debugger 
supports all the MON29K software commands. 

The MON29K product includes:

• MON29K source code

• Documentation

Publication # 10287   Rev. B   Amendment /0
Issue Date: September 1989

ORDERING INFORMATION



Licensing 

The MON29K Resident Monitor is licensed through 
AMD's Standard End-User Software License 
Agreement (Boxtop). This license does not require a 
signature; breaking the seal on the product package 
indicates acceptance of the license terms. If changes 
are required to the license agreement, they can be 
arranged through your AMD sales representative. Many 
software products require the customer to provide a 
CPU ID number when ordering the product. Contact 
your sales representative if this information is not 
available at the time of purchase. In addition, terms of 
the license require the customer to complete a Software 
Warranty card with the serial number and site of the 
host computer on which the resident monitor source will 
reside. This card must be returned to AMD within 30 
days of receipt for the warranty to be valid. 



Order Numbers 

MON29K software executes on Am29000
microprocessor-based systems but is distributed in
machine readable source form for several hosts. Thus,
media type is the only distinguishing characteristic 
when ordering MON29K software. Documentation can 
be ordered separately. The order number (Valid 
Combination) is formed as a combination of: 

■ Product Family 

■ Product Category 

■ Product Identifier 

■ License Type 

■ Host/OS Type 

■ Media Type 



AM29000 SW/ MON

Media Type

08 = 0.25" cartridge tape, TAR format
14 = 3.5" DSHD floppies
21 = 9-track, 1600 BPI mag tape, TAR format
24 = 5.25" DSHD floppies

Host/OS Type

99 = Not Host Specific

License Type

B = Boxtop
S = Signed
"-" = Not Applicable

Product Identifier

MON = MON29K Target Resident Debug Monitor

Product Category

SW/ = Software Product
DC/ = Documentation Product
MA/ = Maintenance Agreement

Product Family

Am29000 Microprocessor

Valid Combinations








Valid Combinations lists configurations planned to be supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid combinations and to check on newly released combinations.

Part Number           Product                    Host                 Media
AM29000SW/MONB9908    MON29K Resident Monitor    Not Host Specific    0.25" cartridge tape, TAR format
AM29000SW/MONS9908    MON29K Resident Monitor    Not Host Specific    0.25" cartridge tape, TAR format
AM29000SW/MONB9914    MON29K Resident Monitor    Not Host Specific    3.5" DSHD floppies
AM29000SW/MONS9914    MON29K Resident Monitor    Not Host Specific    3.5" DSHD floppies
AM29000SW/MONB9921    MON29K Resident Monitor    Not Host Specific    9-track, 1600 BPI tape, TAR format
AM29000SW/MONS9921    MON29K Resident Monitor    Not Host Specific    9-track, 1600 BPI tape, TAR format
AM29000SW/MONB9924    MON29K Resident Monitor    Not Host Specific    5.25" DSHD floppies
AM29000SW/MONS9924    MON29K Resident Monitor    Not Host Specific    5.25" DSHD floppies
AM29000DC/MON-99      MON29K Documentation       UNIX                 Not Media Specific
AM29000MA/MON-99      MON29K Maintenance         Not Host Specific    Not Media Specific
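
The order-number layout described in this section can be decoded mechanically. The sketch below is an illustration only: the structure and function names are hypothetical, and the fixed field widths (two-letter category, three-letter product identifier, one license character, two-digit host code, optional two-digit media code) are inferred from the part numbers listed here rather than taken from an AMD specification.

```c
#include <string.h>

struct order_number {
    char category[3];   /* "SW", "DC" or "MA" */
    char product[4];    /* e.g. "MON" */
    char license;       /* 'B', 'S' or '-' */
    char host[3];       /* e.g. "99" */
    char media[3];      /* e.g. "08"; empty when not media specific */
};

/* Returns 1 on success, 0 if the string does not match the layout. */
int parse_order_number(const char *s, struct order_number *o)
{
    size_t len = strlen(s);

    if (strncmp(s, "AM29000", 7) != 0 || (len != 16 && len != 18) ||
        s[9] != '/')
        return 0;

    memcpy(o->category, s + 7, 2);
    o->category[2] = '\0';
    memcpy(o->product, s + 10, 3);
    o->product[3] = '\0';
    o->license = s[13];
    memcpy(o->host, s + 14, 2);
    o->host[2] = '\0';
    if (len == 18) {
        memcpy(o->media, s + 16, 2);
        o->media[2] = '\0';
    } else {
        o->media[0] = '\0';   /* documentation and maintenance items */
    }
    return 1;
}
```

For example, `AM29000SW/MONB9908` decodes to category SW, product MON, license B (Boxtop), host 99 and media 08.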



FUNCTIONAL DESCRIPTION 

MON29K software resides on the target system and
interfaces to the user through an ASCII terminal
connected to a serial port on the target system. All
commands and formatted displays are communicated
through this serial link. MON29K software supports
simple display formats so that compatibility can be
maintained with any CRT.

MON29K software provides program development 
support at the assembler source level. High-level 
source code development is provided by the XRAY29K 
debugger when it is connected to MON29K monitor. 
MON29K serves as the target resident monitor that 
interrogates memory and registers for the host-resident 
source-level debugger. 

Memory, Register and I/O Addresses 

MON29K software supports three address spaces: 
register, memory, and I/O. Data values are always 
represented in hex, as are memory and I/O addresses. 
Register addresses are represented by decimal 
numbers and grouped as general, local, global, special- 
purpose, and TLB. Special-purpose and TLB registers 
can be accessed by register number or by their 
abbreviated mnemonic. The Special-Purpose Registers 
section that follows discusses other commands for 
accessing these registers. 

Memory and I/O addresses are assumed to be real 
because MON29K software has no mechanism for 
calculating or interpreting virtual addresses. MON29K 
software allows specification of user and supervisor 
modes and specification of OPT lines with all memory 
and I/O addresses. 



Displaying Memory and Registers 

The Display command shows data for a specified range
of addresses, beginning at a specified address or from
the currently active address. Each line in the display
contains 16 bytes of data. The 16 bytes are displayed
as either bytes, half-words, words, single-precision, or
double-precision floating points, depending on the
command entered.

Floating-point numbers are displayed in decimal format
if the value can be represented accurately within the
digits available. Otherwise, scientific notation, E format,
is used.

Following the numeric data is a string of ASCII 
characters in which each character corresponds to one 
byte of data. When no ASCII equivalent exists for the 
byte of data, a period is displayed. Figure 1 shows 
examples of memory and register displays. 
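
The byte display format described above (hex bytes followed by one ASCII character per byte, with a period for bytes that have no printable equivalent) can be sketched in C. This is not MON29K source code; the function name and the exact column spacing are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Formats up to 16 bytes as one display line: an 8-digit hex address,
 * the bytes in hex, then one ASCII character per byte ('.' when the
 * byte has no printable equivalent). `out` must hold at least 80
 * characters. */
void format_dump_line(char *out, unsigned long addr,
                      const unsigned char *data, int count)
{
    int i;

    out += sprintf(out, "%08lx ", addr);
    for (i = 0; i < count; i++)
        out += sprintf(out, " %02x", (unsigned)data[i]);
    out += sprintf(out, "  ");
    for (i = 0; i < count; i++)
        *out++ = isprint(data[i]) ? (char)data[i] : '.';
    *out = '\0';
}
```

Called with address 0x10000 and the four bytes 61 00 62 00, the function produces the hex bytes followed by the characters `a.b.`, matching the convention used in the memory display of Figure 1.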

Altering Memory and Registers 

Memory and register contents can be set, filled, or 
moved. The set command allows the contents of 
registers and memory to be examined and optionally 
changed. One or more values can be set without 
examining the previous contents. The fill command sets 
a range of register or memory addresses to a specific 
value. The move command copies blocks of data from 
one range of addresses to another. Blocks in the 
destination address range may overlap blocks in the 
source address range. 
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Special-Purpose Registers 

The special-purpose register commands provide 
another method for accessing the Am29000 
microprocessor special-purpose and TLB registers. 
These registers are organized into groups: 
Unprotected, Protected, TLB Entries, and Coprocessor. 
Specific commands are used for examining the 
contents of registers in each group. Within a group, 
each register's contents can be examined or changed 
explicitly. 

The large number of registers necessitates special 
register display screens that clearly present each 
group's registers. To enhance display efficiency, the 
single command X is available. It displays the registers 
most likely to be in use: all the global registers, half the 
local registers, and all the unprotected registers. 
Figures 2 and 3 show examples of special-purpose 
register display screens. 



In-Line Assembler/Disassembler 

An in-line assembler/disassembler allows the user to 
examine and change memory using instruction 
mnemonics rather than hex values. This improves 
readability and minimizes user efforts while entering 
changes to instruction memory. The lexical conventions 
and statement syntax used are identical to the standard 
AMD assembler, ASM29K™. 

I/O Commands 

I/O commands provide simple forms of input and output. 
They are intended to allow quick examination and 
simple control of devices. These commands read or 
write a full word of data to or from a real I/O address. 



#dw LR4, LR11
LR004   61006200 63006400 65006600 67006800
LR008   69006a00 6b006c00 6d006e00 6f007000
#
#
# DB 10000I, 1001FI
00010000I   61 00 62 00 63 00 64 00 65 00 66 00 67 00 68 00   a.b.c.d.e.f.g.h.
00010010I   69 00 6a 00 6b 00 6c 00 6d 00 6e 00 6f 00 70 00   i.j.k.l.m.n.o.p.

Figure 1. Register and Memory Display



[Figure: output of the XP command. The display lists the protected special-purpose registers with their bit fields: CPS and OPS (CA, IP, TE, TP, TU, FZ, LK, RE, WM, PD, PI, SM, IM, DI, DA); VAB; CFG (PRL, VF, RV, BO, CP, CD); CHA, CHD and CHC (CE, CNTL, CR, LS, ML, ST, LA, TF, TR, NN, CV); RBP (BF through B0); TCV; TR (OV, IN, IE, TRV); PC0, PC1 and PC2; MMU (PS, PID); and LRU.]

Figure 2. Protected Register Group Display

Downloading

Downloading controls the transmission of data from a 
remote system to the local memory on the target 
system. MON29K software can read COFF binary, 
Motorola S3 hex records, and TEK extended hex files. 
Each of these formats contains the address and byte
count information for loading memory, so no other
parameters need to be specified.
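
To show how a hex record carries its own address and byte count, the following C sketch builds one Motorola S3 record (32-bit address). It follows the generally documented S-record layout: a type field, a byte count covering address, data and checksum, the address, the data, and a checksum that is the one's complement of the low byte of the sum of the preceding bytes. The function name is hypothetical and the code is not taken from MON29K.

```c
#include <stdio.h>

/* Writes one S3 record for `len` data bytes into `out`.
 * `out` must hold at least 2*len + 15 characters; len must be <= 250. */
void format_s3_record(char *out, unsigned long addr,
                      const unsigned char *data, unsigned len)
{
    unsigned count = len + 5;   /* 4 address bytes + data + checksum */
    unsigned sum = count;
    unsigned i;
    int shift;

    out += sprintf(out, "S3%02X%08lX", count, addr & 0xFFFFFFFFUL);
    for (shift = 24; shift >= 0; shift -= 8)
        sum += (unsigned)((addr >> shift) & 0xFF);
    for (i = 0; i < len; i++) {
        out += sprintf(out, "%02X", (unsigned)data[i]);
        sum += data[i];
    }
    /* Checksum: one's complement of the low byte of the running sum. */
    sprintf(out, "%02X", ~sum & 0xFFu);
}
```

A loader reading such a record needs no extra parameters: the count field tells it how many bytes follow, and the address field tells it where to store them.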

An optional downloading parameter, <host command>,
can be specified by the user. The <host command> is a
character string that is uploaded by MON29K to the
remote host system. This command can be used to
initiate the host download procedure remotely from the
MON29K monitor terminal.

Execution Control 

Execution control commands allow the user to start
program execution, step through instructions singly or
in groups, set breakpoints, and specify monitor
commands to be performed when termination occurs.
Following each break in program execution, the
MON29K monitor displays the address and
disassembled contents of the next executable
instruction. In addition, the user can identify registers
and memory to view after the termination of
each breakpoint or step command. This reduces the
amount of information displayed to the data that is
pertinent to the current debugging session.

MON29K software provides eight "sticky" and two "non-sticky"
breakpoints. Sticky breakpoints remain set until
expressly removed by the user. These are useful when
debugging code within an instruction loop. Non-sticky
breakpoints occur once and are removed automatically.
Non-sticky breakpoints are optional parameters of the
go command. Users can easily display, set, and reset
breakpoint addresses.
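
The difference between the two breakpoint classes can be modeled with a small table. This is a sketch of the behavior described above, not the monitor's actual data structure; all names are hypothetical.

```c
#define STICKY_SLOTS     8
#define NON_STICKY_SLOTS 2
#define TOTAL_SLOTS      (STICKY_SLOTS + NON_STICKY_SLOTS)

struct breakpoint {
    unsigned long addr;
    int in_use;
};

/* Slots 0..7 hold sticky breakpoints; slots 8..9 hold non-sticky ones. */
struct breakpoint_table {
    struct breakpoint slot[TOTAL_SLOTS];
};

/* Returns the slot index used, or -1 if that class is full. */
int bp_set(struct breakpoint_table *t, unsigned long addr, int sticky)
{
    int first = sticky ? 0 : STICKY_SLOTS;
    int last = sticky ? STICKY_SLOTS : TOTAL_SLOTS;
    int i;

    for (i = first; i < last; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].addr = addr;
            t->slot[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/* Called when execution reaches `addr`. Returns 1 if a breakpoint was
 * hit. A non-sticky breakpoint is cleared by the hit; a sticky one
 * stays set until removed explicitly. */
int bp_hit(struct breakpoint_table *t, unsigned long addr)
{
    int i, hit = 0;

    for (i = 0; i < TOTAL_SLOTS; i++) {
        if (t->slot[i].in_use && t->slot[i].addr == addr) {
            hit = 1;
            if (i >= STICKY_SLOTS)
                t->slot[i].in_use = 0;
        }
    }
    return hit;
}
```

In this model a sticky breakpoint inside a loop fires on every pass, while a non-sticky one fires on the first pass only.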



Program execution can be stepped one instruction at a 
time or a group of instructions at a time. User-defined 
displays and the address and contents of the next 
executable instruction are displayed after each 
instruction step. When stepping by group, these 
displays can be delayed either until after the last 
instruction in the group is executed, or until after each 
instruction is executed. An option allows only register 
data that was changed to be displayed. This 
automatically informs the user of register changes, thus 
eliminating the need to visually monitor register 
contents. 
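
The changed-registers option can be pictured as a comparison of two register snapshots taken before and after a step. The sketch below assumes a small illustrative register count and a hypothetical function name and output format; it is not MON29K code.

```c
#include <stdio.h>

#define NUM_REGS 8   /* illustrative; far fewer than the Am29000 provides */

/* Prints only the registers whose value differs between two snapshots
 * and returns how many changed. */
int print_changed_registers(const unsigned long before[NUM_REGS],
                            const unsigned long after[NUM_REGS])
{
    int i, changed = 0;

    for (i = 0; i < NUM_REGS; i++) {
        if (before[i] != after[i]) {
            printf("LR%03d %08lx -> %08lx\n", i, before[i], after[i]);
            changed++;
        }
    }
    return changed;
}
```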

Remote Mode 

MON29K software supports two serial ports: one to a 
terminal and one to a host computer. In normal mode, 
either port can be used for initiating commands or for 
downloading programs. In remote mode, the two serial
ports are linked together, allowing the terminal to 
communicate directly with the host computer. 

Miscellaneous Commands 

An on-screen help facility, as seen in Figure 4, lists all 
MON29K monitor commands. Information about a 
specific command is obtained by specifying the 
command name as a parameter to the help command. 

Am29027 Arithmetic Accelerator Support 

MON29K software is fully integrated with the AMD 
Am29027 Arithmetic Accelerator. In the same manner 
that the Am29000 microprocessor registers can be 
accessed, the Am29027 microprocessor registers can 
be both displayed and modified using MON29K 
software. An example of an Am29027 microprocessor 
register display is shown in Figure 5. 



[Figure: output of the XT command. The display lists TLB entries by line and register, with fields including VTAG, VE, SR, SW, SE, UR and UE.]

Figure 3. TLB Entries Group Display

Target System Requirements

The Am29000 microprocessor supports separate code 
and data spaces and provides no instructions for 
moving information between data and instruction 
spaces. Because of this, the target system must
provide a mechanism for writing to code space in order 
for MON29K monitor to set breakpoints and load 
instruction memory. 

MON29K software is designed to support a
memory-mapped Z8530 SCC serial device. However, source
code is provided so the user can change the MON29K 
monitor to support other devices on a particular target 
system. 



Other Tools 

MON29K is a stand-alone product that does not depend
on other software to function. However, MON29K
software is delivered in source form and will need to
be compiled with the AMD HighC29K™
Cross-Development Toolkit; modification may be necessary if
compiled with other Am29000 microprocessor C
compilers.



# H

Help:
H or ?    to see this display
H<name>   help with a named command
?<name>   help with a named command

Target Resource Access:
D  - Display registers/memory
S  - Set registers/memory
F  - Fill registers/memory
M  - Move registers/memory
A  - Assemble in memory
L  - List disassembly from mem
I  - Input from port
O  - Output to port
XU - Display/set unprotected reg
XP - Display/set protected reg
XT - Display/set TLB entries
XC - Display/set Am29027 reg
X  - Display key registers
Y  - Load a file to memory
V  - Save memory to a file

Execution Control:
E  - End execution command list
B  - Display/Set/Clear breaks
G  - Go (start execution)
T  - Trace (single/multiple step)

Miscellaneous:
R  - Remote mode (talk to host)
N  - Normal (change 'normal' char)
Q  - Re-initialize monitor



Figure 4. On-Screen Help Facility 



[Figure: output of the XC command. The display shows the Am29027 register file RF0 through RF7 (MSW and LSW of each), the R, R TEMP, S, S TEMP and F registers, the INSTR and I TEMP instruction registers with their bit fields, and the status, flags (FL6 through FL0) and mode register bit fields.]

Figure 5. Am29027 Register Display

Software Warranty

Software programs licensed by AMD are covered by the 
warranty and patent indemnity provisions appearing in 
AMD's standard software license forms. AMD makes no 
warranty, express, statutory, implied, or by description 
regarding the information set forth herein or regarding
the freedom of the described software program from 
patent infringement. AMD reserves the right to modify, 
change, or discontinue the availability of this software 
program at any time and without notice. 

Customer Support 
Maintenance 

All orderable software products include one year of free 
Maintenance Support, which starts from the date of 
original purchase. Maintenance Support allows 
customers to receive technical assistance from highly 
trained field and factory personnel, to use a call-in on- 
line information system, and to receive product and 
documentation updates at no additional charge. 
Customers may extend Maintenance Support in one- 
year increments. Customers can access support 
services by calling the 24-hour, toll-free 29K™ Family 
hotline at (800) 2929-AMD (292-9263). 

On-Line Call-in Bulletin Board

In addition to the support engineering staff, AMD offers 
a 24-hour on-line technical support center. The 
customer can call (800) 2929-AMD at any time to query 
the system for the latest information on a particular 
product: bug fixes, work-arounds, information on up- 
coming releases, etc. Messages may be left for the 
support engineering staff during "after hours." 






Training Classes 

AMD offers training classes for the 29K Family 
products. These classes focus on 29K Family system 
design and implementation using the broad range of 
AMD software development tools. Customers can 
shorten the development process through extensive 
hands-on training covering a variety of topics. Contact 
your local AMD field office for more information on 
training classes. 

Fuslon29K Program 

AMD encourages broad-based development and 
support for the Am29000 microprocessor with the 
Fusion29K™ program, a joint-effort program between 
AMD and third-party developers. Published twice a 
year, the Fusion29K program catalog reveals the 
breadth of development and system solutions for the 
29K Family, including software generation and debug 
tools; hardware development tools; executive, kernel, 
and multi-user operating systems; board-level products; 
silicon products; and more. For a copy of the Fusion29K 
program catalog, call your local AMD field sales office or 
the literature center at (800) 222-9323. 
Preliminary 



XRAY29K 

Source-Level Debugger 



Advanced Micro Devices 



DISTINCTIVE CHARACTERISTICS 

■ Supports symbolic debugging with expressions and 
statements for Am29000™ microprocessor development 
environments 

■ Controls and examines program execution in 
high-level and assembly-level modes 

■ Provides interface and start-up code for the 
Am29000 microprocessor, which allows use of 
the MON29K™ Target-Resident Monitor, 
ADAPT29K™ Advanced Development and 
Prototyping Tool, and PCEB29K™ PC Execution 
Board 

■ Uses window-oriented display to segregate 
debug information in meaningful regions 

■ Allows single-step execution and placement of 
simple and complex breakpoints 

■ Supports custom screens and viewports, and 
one-key command functions 

■ Provides command, breakpoint, and viewport 
macros 

■ Supports automatic test sequences by processing 
command files and logging output to a file 

■ Includes on-line help, comprehensive documentation, 
and a sample debug session 



GENERAL DESCRIPTION 

AMD's XRAY29K™ source-level debugger provides 
engineers with a multiwindow interactive environment 
for debugging high-level and assembly-level software 
programs for Am29000-based systems. XRAY29K software 
resides on IBM® ATs® and compatibles, and Sun 
Workstations®. Program execution is monitored and 
controlled in high-level source or assembly language, 
from the host system through the PCEB29K execution 
board, MON29K monitor, or ADAPT29K debugger on 
the target system. Control is extensive, including 
debugger commands for setting breakpoints, single-stepping 
through the program, and examining or altering 
register and memory contents. 



XRAY29K software allows examination and modifica- 
tion of a variable's contents and computation of high- 
level and assembly language expression values. Sym- 
bols can be added, displayed, and deleted in the sym- 
bol table. 

The XRAY29K product includes: 
■ XRAY29K Software 

■ Documentation 

■ Install testing program 

■ Start-up code for ADAPT29K or targets using 
MON29K 






Publication # 10626 Rev. C Amendment /0 
Issue Date: September 1989 



ORDERING INFORMATION 






Licensing 

The XRAY29K Source-Level Debugger is licensed 
through AMD's Standard End-User Software License 
Agreement (Boxtop). This license does not require a 
signature; breaking the seal on the product package in- 
dicates acceptance of the license terms. If changes are 
required to the license agreement, they can be ar- 
ranged through your AMD sales representative. Many 
software products require the customer to provide a 
CPU ID number when ordering the product. Contact 
your sales representative if this information is not avail- 
able at the time of purchase. 



Order Numbers 

The XRAY29K Source-Level Debugger is available for 
several different environments. Documentation can be 
ordered separately. The order number (Valid Combina- 
tion) is formed as a combination of: 

■ Product Family 

■ Product Category 

■ Product Identifier 

■ License Type 

■ Host /OS Type 

■ Media Type 



Product Family 
    AM29000 = Am29000 Microprocessor 

Product Category 
    SW/ = Software Product 
    DC/ = Documentation Product 
    MA/ = Maintenance Agreement 

Product Identifier 
    XRY = XRAY29K Source-Level Debugger 

License Type 
    B = Boxtop 
    S = Signed 
    "-" = Not Applicable 

Host/OS Type 
    07 = Sun-3 
    10 = PC-AT 

Media Type 
    08 = 0.25" Sun cartridge tape, TAR format 
    14 = 3.5" DSHD floppies 
    21 = 9-track, 1600 BPI mag tape, TAR format 
    24 = 5.25" DSHD floppies 
Valid Combinations 

Valid Combinations list configurations planned to be supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid combinations and to check on newly released combinations. 



Order Number        Product                         Host    Media 
AM29000SW/XRYB0708  XRAY29K Source-Level Debugger   Sun-3   0.25" cartridge tape, TAR format 
AM29000SW/XRYS0708  XRAY29K Source-Level Debugger   Sun-3   0.25" cartridge tape, TAR format 
AM29000SW/XRYB0721  XRAY29K Source-Level Debugger   Sun-3   9-track, 1600 BPI tape, TAR format 
AM29000SW/XRYS0721  XRAY29K Source-Level Debugger   Sun-3   9-track, 1600 BPI tape, TAR format 
AM29000SW/XRYB1014  XRAY29K Source-Level Debugger   PC-AT   3.5" DSHD floppies 
AM29000SW/XRYS1014  XRAY29K Source-Level Debugger   PC-AT   3.5" DSHD floppies 
AM29000SW/XRYB1024  XRAY29K Source-Level Debugger   PC-AT   5.25" DSHD floppies 
AM29000SW/XRYS1024  XRAY29K Source-Level Debugger   PC-AT   5.25" DSHD floppies 
AM29000DC/XRY-99    XRAY29K Documentation           UNIX    Not Media Specific 
AM29000MA/XRY-07    XRAY29K Maintenance             Sun-3   Not Media Specific 
AM29000MA/XRY-10    XRAY29K Maintenance             PC-AT   Not Media Specific 
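
The way an order number is assembled from its fields can be illustrated with a short decoder. The Python sketch below is illustrative only and is not an AMD tool: the function name, the regular expression, and the dictionary layout are assumptions, and the code tables simply restate the ordering legend. Codes that the legend does not list (such as 99) are reported as unlisted rather than guessed.

```python
import re

# Code tables restated from the ordering legend.
CATEGORIES = {
    "SW/": "Software Product",
    "DC/": "Documentation Product",
    "MA/": "Maintenance Agreement",
}
LICENSES = {"B": "Boxtop", "S": "Signed", "-": "Not Applicable"}
HOSTS = {"07": "Sun-3", "10": "PC-AT"}
MEDIA = {
    "08": '0.25" Sun cartridge tape, TAR format',
    "14": '3.5" DSHD floppies',
    "21": "9-track, 1600 BPI mag tape, TAR format",
    "24": '5.25" DSHD floppies',
}

# family, category, identifier, license, host code, optional media code
ORDER_RE = re.compile(r"^(AM29000)(SW/|DC/|MA/)(XRY)([BS-])(\d{2})(\d{2})?$")


def decode_order_number(order):
    """Split an XRAY29K order number into its named fields."""
    match = ORDER_RE.match(order)
    if match is None:
        raise ValueError(f"not a recognizable order number: {order!r}")
    family, category, ident, lic, host, media = match.groups()
    return {
        "family": family,
        "category": CATEGORIES[category],
        "identifier": ident,
        "license": LICENSES[lic],
        "host": HOSTS.get(host, f"code {host} (not in legend)"),
        # Documentation and maintenance orders carry no media code.
        "media": MEDIA.get(media, f"code {media} (not in legend)")
        if media
        else "Not Media Specific",
    }


if __name__ == "__main__":
    for number in ("AM29000SW/XRYB0708", "AM29000MA/XRY-10"):
        print(number, decode_order_number(number))
```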



FUNCTIONAL DESCRIPTION 

XRAY29K software aids the control and examination of 
program execution, and can set and examine memory 
and register contents, set and remove breakpoints in 
either high-level source or assembly language code, 
and display and alter the microprocessor state. In addi- 
tion to symbolic debugging, the XRAY29K debugger's 
special features include help screens, macro capabili- 
ties, command files, conditional commands, and 
debugging through ports. For example, in batch mode, 
command files can issue directives to XRAY29K soft- 
ware to implement automated test sequences. 

XRAY29K software functions in either high-level or as- 
sembly-level mode. In high-level mode, an application 
is debugged using C language source lines to control 
and monitor execution. C variables and expressions 
replace numeric addresses for memory access. Code 
can be viewed by line number or procedure name. In 
assembly-level mode, an application is debugged using 
assembly language statements. In addition to all the ca- 
pabilities available in high-level mode, assembly-level 
mode includes machine-level register and status bit 
manipulation. For each mode, the monitor's screen is 
partitioned into areas called viewports, which group 
related information so that it is easy to identify. 



Viewport Commands 

When the XRAY29K debugger executes, the screen is 
divided into areas called viewports. The number of 
viewports and the information shown in each depend on 
whether the object module was written in a high-level 
language (high-level mode) or assembly language 
(assembly-level mode). 

The standard screen for high-level mode has four view- 
ports: data, trace, code, and command. This screen is 
displayed when an object module generated by a high- 
level source program is executed. The standard screen 
for assembly-level mode has five viewports: data, stack, 
disassembled code, Am29000 microprocessor regis- 
ters, and command. This screen is displayed when an 
object module generated by an assembly language 
program is executed. Figures 1 and 2 show examples of 
these screens. 

Viewport commands control the way information is dis- 
played on the screen. Changing a viewport's size, color, 
and cursor position as well as adding and deleting a 
custom viewport are viewport commands. In addition, 
viewports can be cleared of data, and macros can be 
associated with them. Frequently used viewport com- 
mands are associated with function keys for easy 
access. 
vactive    Activate a viewport 
vclear     Clear data from a viewport 
vclose     Remove a user-defined viewport or screen 
vmacro     Attach a macro to a viewport 
vopen      Create a screen/create or resize a viewport 
vscreen    Activate a screen 
vsetc      Set a viewport's cursor position 
zoom       Increase or decrease viewport size 



Macro Commands 

XRAY29K software supports macros to create and exe- 
cute complex command procedures, such as testing 
program variables, and to conditionally execute other 
sets of commands. Macros can be defined and used 
any time during a debugging session and can include 
comments to explain their function. A macro definition 
may contain parameters that can be changed for each 
macro call. 

Used as commands or in expressions, macros can be 
attached to a breakpoint to create complex breakpoint 
condition testing, or to a custom viewport to control data 
display. Complex initialization conditions can be repre- 
sented as a sequence of macro commands in a com- 
mand file. Statements to increment variables, perform 
loops and conditions, and control target program flow 
can be part of a macro. 

XRAY29K software provides a set of macro flow control 
statements. These statements are similar to C condi- 
tional statements (e.g., IF, ELSE, WHILE, DO, FOR, 
RETURN and CONTINUE). To create a macro, the de- 
fine command is used. After macro creation, the show 
command allows the macro's source to be viewed. 



Commands to attach a macro to a viewport are part of 
the viewport command set. Commands that attach a 
macro to a breakpoint are part of the execution and 
breakpoint command set. 

define    Create a macro 
show      Display a macro's source 

Debugger Commands 

Commands, whether in high-level source or assembly 
language mode, can be entered interactively from the 
keyboard in the command viewport or placed in a com- 
mand file and accessed as include or batch files. Some 
commands take qualifiers that provide additional infor- 
mation on how to execute the command and parame- 
ters that describe an object and communicate ad- 
dresses or file specifications. 

Breakpoints and Execution Commands 

A breakpoint causes program execution to halt or 
causes the XRAY29K debugger to take some action, 
such as incrementing a counter each time the target 
program attempts to execute an instruction at a speci- 
fied memory location. A macro can be associated with 
the breakpoint to control execution. A special break- 
point viewport shows breakpoint information during the 
debugging session, including the breakpoint identifica- 
tion number. Automatically assigned by XRAY29K soft- 
ware, the breakpoint number can reference or clear a 
breakpoint. 

Execution commands start program execution or 
resume execution after explicit suspension. The program 
can be instructed to continue, single step, or set 
temporary instruction breakpoints. Single stepping is 
performed by C source line in high-level mode and by 
microprocessor instruction in assembly-level mode. 
In addition, for each step, a macro can be invoked. 



[Figure 1 screen layout, top to bottom: Data viewport (monitored data); Trace viewport (routine traceback information); Code viewport (source code); status line; Command viewport (debugger commands)] 

Figure 1. Standard High-Level Screen 
[Figure 2 screen layout: Data viewport (monitored data); Code viewport (disassembled code); status line; Command viewport (debugger commands); Stack viewport (stack contents); Registers viewport (Am29000 microprocessor registers)] 

Figure 2. Standard Assembly-Level Screen 



breakinstruction    Set an instruction breakpoint 
clear               Clear a breakpoint 
go                  Start or continue program execution 
gostep              Execute a macro after each instruction step 
step                Execute a number of instructions or lines 
stepover            Single step, but execute through procedures 

Display Commands 

Display commands write program information to a view- 
port or file about memory, expressions, or procedures. 
C source code, for example, can be listed starting at a 
particular line number or for a named procedure. Any 
active procedure (that is, a procedure on the stack) can 
have its values displayed. 

Memory contents can be dumped in both hexadecimal 
and ASCII text format, and, when in assembly-level 
mode, memory can be disassembled and displayed in 
the code viewport. Variables can be monitored and 
examined in the data viewport as the target program 
executes. An expression or expression range can be 
displayed in the command viewport according to type. 

For type conversions, scaling, and output positioning, 
display commands can open a file or device and then 
write formatted output to it. Several format options are 
provided, similar in function to those provided to C in 
standard runtime libraries. 

disassemble    Display disassembled memory (assembly mode) 
dump           Display memory contents 
expand         Display a procedure's local variables 
find           Search for a string 
fopen          Open a file or device for writing 
fprintf        Print formatted output to a viewport 
list           Display C source code 
monitor        Monitor expressions 
next           Find a string's next occurrence 
nomonitor      Discontinue monitoring an expression 
printf         Print formatted output to command viewport 
printvalue     Print a variable's value 



Memory and Register Commands 

To help track down problems and test fixes, memory 
and registers can be examined and altered. Two blocks 
of memory, for example, can be compared for similari- 
ties or differences to check for a corrupt RAM image. 
Memory and registers can be modified temporarily to 
patch programs and continue testing during a debugging 
session. Expression evaluation is supported during 
searching and modification. 

compare Compare two blocks of memory 

copy Copy a memory block 

fill Fill a memory block with values 

nomen Prevent access to a memory location 

search Search a memory block for a value 

setmem Change a memory address 

setreg Change a register's contents 

test Examine memory area for invalid values 
Symbol Commands 

A symbol is a sequence of characters used to represent 
arithmetic values, memory addresses, and C variables. 
XRAY29K software knows about two types of symbols: 
program and debugger. Program symbols are symbolic 
data names or program labels that were defined during 
the source program's creation. Debugger symbols ma- 
nipulate and direct the flow of the debugger and are 
specified by the user during a debugging session. 

Symbol commands encompass both types of symbols. 
Debugger symbols can be added to the debugger sym- 
bol table, and then displayed or removed. Information 
about program symbols, such as name, data type, stor- 
age class, and memory location, can be displayed. 

add             Create a symbol 
context         Show the current context 
delete          Delete a symbol from the symbol table 
printsymbols    Display symbol information 
scope           Specify current module and procedure scope 

Utility Commands 

Command files are commonly used to read macro defi- 
nitions from a file or to change viewports. After a com- 
mand file has been created, it may be included in a 
startup file and executed as if entered at the keyboard. 
When an include file error is encountered, XRAY29K 
software can be directed to quit, abort, or continue. A log 
of commands entered at the keyboard can be retained 
and then subsequently used as a command file. If 
XRAY29K software display and execution defaults are 
changed, they can be saved in a new startup file. 
All these operations are accessed through utility 
commands. 

Other utility commands control the microprocessor's 
state. Reset simulates a microprocessor reset. Restart 
restores the microprocessor to its initial state without 
initializing memory or restarting the program, and it sets 
the program counter to the original starting address 
from the absolute file but maintains breakpoint declara- 
tions. In addition, the user can temporarily change the 
default values for debugger startup options, such as 
enabling procedure-level tracing in the trace viewport 
and intermixing C source code with assembly code in 
the code viewport. 

XRAY29K software automatically selects the correct 
debugging mode based on whether the object module was 
created by the high-level compiler or the assembler. 
When a program has both kinds of object modules, a 
utility command toggles between the two modes. 

XRAY29K software includes a search facility that can 
find information in a source file and display the value of 
an expression in decimal, hexadecimal or ASCII format. 



On-line help is provided for all debugger commands, 
command arguments, and function keys, and includes a 
selection menu. 

alias          Replace the name of the command 
cexpression    Calculate an expression's value 
error          Set include file error handling 
help           Display on-line help screen 
history        Recall a specific command 
include        Read in and process a command file 
journal        Save all viewport commands and data to a file 
log            Record debugger commands and errors in a file 
mode           Select debugging mode (high or assembly) 
option         Set debugger options for this session 
pause          Pause simulation 
reset          Simulate microprocessor reset 
restart        Reset the program starting address 
startup        Save the default startup options 

Session Control 

The debugger session can be ended at any time or can 
be paused while the host operating system environment 
is used and then entered again. This area also controls 
which object modules are loaded for debugging. 

host Temporarily enter the host environment 
load Load an object module for debugging 
quit End a debugging session 

System Requirements 

The XRAY29K software resides on the host system and 
presents the user with a friendly, high-level interface to 
the Am29000 microprocessor-based system. The software 
communicates with the target system through a serial 
interface to the ADAPT29K unit or to a target board 
running the MON29K target-resident debug monitor, or 
through a bus interface to the PCEB29K personal computer 
execution board. The MON29K software and the 
ADAPT29K unit actually perform all the Am29000 
microprocessor memory and register reads and writes 
requested by the user through XRAY29K debugger 
commands. 

Before the XRAY29K debugger can be used, an absolute 
object module must be created and downloaded 
into target system RAM. The object module 
is created using AMD's HighC29K compiler or ASM29K 
assembler. Once generated, the object module is 
loaded into target system RAM by invoking the 
XRAY29K software Load command. Figure 3 illustrates 
the AMD development tool chain. 
Software Warranty 

Software programs licensed by AMD are covered by the 
warranty and patent indemnity provisions appearing in 
AMD's standard software license forms. AMD makes no 
warranty, express, statutory, implied, or by description 
regarding the information set forth herein or regarding 
the freedom of the described software program from 
patent infringement. AMD reserves the right to modify, 
change or discontinue the availability of this software 
program at any time and without notice. 

Customer Support 

Maintenance 

All orderable software products include one year of free 
Maintenance Support, which starts from the date of 
original purchase. Maintenance Support allows custom- 
ers to receive technical assistance from highly trained 
field and factory personnel, to use a call-in on-line infor- 
mation system and to receive product and documenta- 
tion updates at no additional charge. Customers may 
extend Maintenance Support in one-year increments. 
Customers can access support services by calling the 
24-hour, toll-free 29K™ Family hotline at (800) 2929-AMD 
(292-9263). 

On-Line Call-in Bulletin Board 

In addition to the support engineering staff, AMD offers 
a 24-hour on-line technical support center. A customer 
can call (800) 2929-AMD at any time to query the 
system for the latest information on a particular product: 
bug fixes, work-arounds, information on upcoming re- 
leases, etc. Messages may be left for the support engi- 
neering staff during "after hours." 

Training Classes 

AMD offers training classes for the 29K Family prod- 
ucts. These classes focus on 29K Family system design 
and implementation using the broad range of AMD 
software development tools. Customers can shorten 
the development process through extensive hands-on 
training covering a variety of topics. Contact your local 
AMD field sales office for more information on training 
classes. 

Fusion29K Program 

AMD encourages broad-based development and sup- 
port for the Am29000 microprocessor with the 
Fusion29K™ program, a joint-effort program between 
AMD and third-party developers. Published twice a 
year, the Fusion29K program catalog reveals the 
breadth of development and system solutions for 
the 29K Family, including software generation and 
debug tools; hardware development tools; executive, 
kernel, and multi-user operating systems; board-level 
products; silicon products; and more. For a copy of 
the Fusion29K program catalog, call your local 
AMD field sales office or the AMD literature center at 
(800) 222-9323. 
Figure 3. AMD Development Tool Chain 

Am29000 SYSCLK Driving 
Application Note 



by Tom Crawford 






INTRODUCTION 

The purpose of this note is to describe the options for 
connecting the SYSCLK pin in an Am29000™ system. 

GENERAL CONSIDERATIONS 

SYSCLK in any Am29000 system is going to be a high- 
frequency, heavily loaded signal with strict duty factor 
requirements. The most important considerations are 
DC levels, capacitive loading, rise/fall times, high/low 
times, and transmission line effects. 

There are basically two options. One may make 
SYSCLK a source or one may make SYSCLK a desti- 
nation. 

SYSCLK AS A SOURCE 

The easiest (and the recommended) way to connect the 
clocks in the system is to have the Am29000 generate 
and drive SYSCLK. Figure 1 shows the connections. 

In this configuration, PWRCLK (pin P3) is connected 
directly to VCC. This is a power pin; it must not be just 
pulled up through a resistor. 

Two times the desired operating frequency is injected 
into INCLK. This is a TTL signal and the duty factor is 
unimportant so long as it meets the minimum High time 
and Low time parameters (see the Am29000 data 
sheet, order #09075). 

SYSCLK is an output with CMOS levels (it swings from 
nearly ground to nearly VCC). All the SYSCLK relative-timing 
parameters are measured with respect to 
SYSCLK at 1.5 volts, the normal TTL "trip point." 

Since SYSCLK must have fairly fast rise and fall times 
and may be physically long, it may behave as a trans- 
mission line (i.e., exhibit reflections). These effects can 
be minimized using a few precautions. 

If SYSCLK goes to more than one or, at most, two 
places on the board, separate traces to each destina- 
tion should be used. This minimizes the length of each 
line and minimizes the capacitive loading on each line. 
Series resistors at the source (at the Am29000) for each 
line will reduce the edge rates. Using Schottky or Fast 
logic is often preferable to CMOS logic, which lacks 
input diodes to ground. 



Before resorting to parallel termination, one should 
consider carefully the effects of relatively high DC loading 
on the buffer VOH and VOL. 

The prudent engineer will analyze his SYSCLK signal 
with SPICE or a similar CAD package. This permits a 
prediction of the actual behavior of the circuit, which is 
essentially impossible to obtain without modeling. 

At this time, there is no guaranteed relationship be- 
tween the input on INCLK and the output on SYSCLK. 
Information on this relationship will be included in the 
Am29000 Data Sheet (order #09075). 

SYSCLK AS A DESTINATION 

SYSCLK can be driven externally. This is typically done 
to provide an external signal with a known phase 
relationship to SYSCLK, perhaps at twice the frequency. 
Figure 2 shows the connections. 

PWRCLK and INCLK must both be connected directly 
to ground. 

SYSCLK is an input and must be driven with a CMOS-level 
clock at the operating frequency. The fact that signals 
are generated from both edges of SYSCLK dictates 
that it be very nearly a perfect square wave (measured 
from 1.5 V to 1.5 V). Perhaps the best way to generate such 
a signal is to begin with one at 2X frequency and divide 
it by two with a flip-flop. The result is buffered with one 
or more sections of a CMOS buffer. A typical clock 
generator is shown in Figure 3. 
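
The reason a divide-by-two stage yields a square wave can be checked with a small model. The Python sketch below is illustrative and is not part of the original note: the function names and the 25-MHz target frequency are assumptions. It toggles an output on each rising edge of the 2X oscillator, so the oscillator's own High and Low times never enter the result.

```python
def divide_by_two(rising_edge_times_ns):
    """Model an edge-triggered flip-flop wired to toggle on each rising edge.

    Returns a list of (time_ns, level) output transitions.
    """
    level = 0
    transitions = []
    for t in rising_edge_times_ns:
        level ^= 1  # output changes only on Low-to-High input transitions
        transitions.append((t, level))
    return transitions


def duty_factor(transitions):
    """Fraction of the first full output period that the output is High."""
    (t_rise, _), (t_fall, _), (t_next_rise, _) = transitions[:3]
    return (t_fall - t_rise) / (t_next_rise - t_rise)


sysclk_mhz = 25.0                           # assumed operating frequency
osc_period_ns = 1000.0 / (2 * sysclk_mhz)   # oscillator runs at twice SYSCLK

# Only rising edges matter, whatever the oscillator's duty factor is.
edges = [n * osc_period_ns for n in range(6)]
output = divide_by_two(edges)

print("oscillator period (ns):", osc_period_ns)
print("output duty factor:", duty_factor(output))
```

The model ignores propagation delay; any difference between the flip-flop's rising and falling output delays shifts the result away from an ideal square wave.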
Figure 1. Source 

Figure 2. Destination 



The TTL oscillator operates at twice the required 
frequency. Since the 74AC74 is edge triggered, it 
responds only to the Low-to-High transition of the 
oscillator. Its output is nominally a square wave 
(nominally because the tPHL may not be the same as 
tPLH). 

The buffer is more interesting. Clearly, it has to be 
CMOS since SYSCLK is a CMOS input. It has to be 
characterized to drive substantial capacitance since the 
Am29000 has an input capacitance of 90 pF. One can 
put multiple elements in parallel as long as they are in 
the same package. In addition, one can drive different 
portions of the load with different sections of the device. 



As long as they are in the same package and are similarly 
loaded, they will exhibit similar delays. In some 
design groups, putting buffers in parallel is a prohibited 
activity, since it is sometimes difficult to determine when 
one of the buffers has failed. Local design rules should 
always prevail. 

Take, for example, the IDT 74FCT240A. With light DC 
loading, the output swings within 0.2 V of the power 
supply. At 50-pF loading, the propagation delay is 
1.5 ns minimum and 4.8 ns maximum. Putting two 
elements in parallel will solve the capacitive-loading 
situation, if it really needs to be solved. The actual 
waveforms should be examined before adding another 
buffer. The IDT data book does not distinguish between 
tPHL and tPLH. The device should be characterized at 
the actual expected loading, temperature, and voltage 
ranges to determine the actual switching char- 
acteristics. 

Take, for a second example, the 74AC04. With light DC 
loading, the output swings within 0.1 V of the power 
supply. The guaranteed propagation delays for the 
74AC00 are 1.0 ns to 7.0 ns; we expect an AC04 to be 
the same. In fact, a device actually driving an Am29000 
has measured propagation delays of tPLH = 4 ns and 
tPHL = 5 ns. Two elements in parallel appear to provide a 
somewhat cleaner waveform. 
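
The effect of unequal buffer delays on the SYSCLK High and Low times can be estimated with simple arithmetic. The Python sketch below is illustrative only: the function name and the 40-ns (25-MHz) period are assumptions, the measured delays are taken to be in nanoseconds, and the buffer input is assumed to be an ideal square wave. Because the output's rising edge is delayed by tPLH and its falling edge by tPHL, the High time changes by their difference, whether or not the buffer inverts.

```python
def buffered_high_time_ns(period_ns, t_plh_ns, t_phl_ns):
    """High time at the buffer output for an ideal square-wave input.

    The rising output edge arrives tPLH late and the falling output edge
    arrives tPHL late, so the High time grows by (tPHL - tPLH).
    """
    return period_ns / 2 + (t_phl_ns - t_plh_ns)


period_ns = 40.0  # assumed SYSCLK period (25 MHz)
high_ns = buffered_high_time_ns(period_ns, t_plh_ns=4.0, t_phl_ns=5.0)
low_ns = period_ns - high_ns

print(f"High time: {high_ns} ns, Low time: {low_ns} ns")
```

Under these assumptions the skew is 1 ns per half-cycle, which is the quantity to compare against the SYSCLK High and Low time requirements in the data sheet.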
Figure 3. Clock Generator 



by Tom Crawford 



The use of the Am29000™ has been proposed in a sys- 
tem where the instruction and data buses are con- 
nected directly to each other and to a single memory. 
Figure 1. Block Diagram 



If the memory is very fast (single cycle), then pipelined 
or burst accesses never need to take place. Every access 
is a simple one-cycle access. Data writes would 
have to be two cycles (because BINV is valid so late). 
Presumably this would be either a fairly high-end system 
with lots of very fast memory or a cache system with 
a modest amount of SRAM backed up by lots of DRAM. 

This depends on the availability of very fast static 
RAMs. The equation below shows how to calculate the 
required access time of the RAMs. 

tMAX = tCLK - (para6 + para9A) 

For a 25-MHz device running at various clock rates: 

FREQ (MHz)   tCLK (ns)   para6 (ns)   para9A (ns)   tMAX (ns) 
25.00        40          14           6             20 
22.22        45          14           6             25 
20.00        50          14           6             30 
18.18        55          14           6             35 

An attempt to actually mechanize a system like this 
uncovered a problem. When the Am29000 follows an 
instruction read with a data write, there is a guaranteed 
"bus crash." 

Parameter 10 requires that the data remain on the bus 
for 2 ns after the rising edge of SYSCLK; in fact, RAM 
disable times are typically 15 ns. This means there is no 
known method to get the instruction off the instruction 
bus until as long as 15 ns after the clock rises. Additionally, 
in the best possible case, a PAL® delay must be 
added to allow for the use of SYSCLK to turn off the 
buffers. Transceivers have a similar problem and, in 
addition, deduct from the allowable access time. 

Figure 2. RAM Timing 

Figure 3. Bus Crash 

Parameter 6A specifies the maximum delay of SYSCLK 
to write data valid. There is no minimum and, in prac- 
tice, the buffers come out of Hi-Z with the rising edge of 
SYSCLK. Since the instruction bus and data bus are 
tied together, there is an unavoidable collision. The 
memory continues to drive the common bus after the 
Am29000 begins to drive it. 

This problem does not occur in the case of data read 
followed by a data write. The Am29000 is guaranteed to 
insert an unused cycle. This provides adequate time for 
the memory to get off the data bus. 

A way to prevent this from occurring is to place a set of 
transceivers between the data bus and the instruction 
bus. 



Now the block diagram looks like this: 
Figure 4. Buffers Added 

The transceivers between the data bus and the instruction 
bus will isolate the Am29000 data drivers from the 
RAM drivers long enough to allow the RAM drivers to go 
into high impedance. The transceivers are then turned 
on, pointed in the correct direction, and the data can be 
driven into the array. 

The instruction path has no additional delays (other 
than the added capacitance of the transceivers). It can 
still do single-cycle instruction fetches. The delay 
imposed in the data path certainly dictates a two-cycle 
load, unless the memory is substantially faster than 
would otherwise be necessary for instruction fetches. 
Stores are not affected since BINV comes out too 
late to allow single-cycle operations anyway. 

The buses may also be required to be connected together 
when the memory must be common because of 
software requirements. With a slow memory, the access 
time added by the insertion of a buffer is a much smaller 
percentage of the total access time. 
Byte-Writable Memories For The Am29000 
Application Note 






by Tom Crawford 



OVERVIEW 

This document describes how to implement a byte- 
write memory design for an Advanced Micro Devices 
Am29000™-based system. While this document will 
concentrate on the specific case of unsigned bytes, an 
analogous case exists with signed bytes and signed 
and unsigned halfwords. 

There are three important benefits that accrue from in- 
corporating a partial write capability. 

Assembly code can run faster 

The code to perform a byte write currently generated by 
the compilers and recommended for assembly-lan- 
guage programmers looks like Figure 1. 

A substantially faster way to do the same thing (given 
that the memory can write single bytes) looks like 
Figure 2. 

This is faster since the initial LOAD is avoided. In addi- 
tion, the compiler is more likely to be able to "bury" an 
isolated STORE by scheduling than both a LOAD and a 
STORE. 

Future compiler releases will support byte-write 
memories 

AMD is enhancing our compiler to optionally generate 
the byte writable code shown above. To benefit from 
these enhancements, an application's memory must be 
able to support byte writes. 



Future revisions of silicon will support byte writes 

Future Am29000 CPU products will be designed to di- 
rectly support byte writes. One approach would involve 
having the processor replicate the byte in question onto 
all byte positions. Analogous logic would have the pro- 
cessor pick the correct byte during a LOAD. The system 
design would have to be able to execute byte writes to 
take advantage of the saved cycle. 

The AMD Binary Compatibility Standard (BCS) 

AMD's BCS will assume a memory that has byte-write 
capability. Therefore, if binary compatibility is important 
to your application, your memory will need to support 
byte writes. 

WHAT MEMORY DESIGNERS MUST DO 

The bottom line to support byte-write capability is, "you 
have to be able to suppress writes to one or more 
bytes." This has two implications. The first is that some 
control signal or signals must be generated and distrib- 
uted by byte. The second is that you must choose be- 
tween suppressing the write by completely suppressing 
the memory cycle or by turning it into some kind of cycle 
other than a write. 



    load    0,17,temp,addr    ;load full word into register,
                              ;set BP to correct address. 0x11
                              ;in the CNTL field selects SB and
                              ;OPT bits corresponding to byte
    inbyte  temp,temp,data    ;insert byte into proper position
    store   0,17,temp,addr    ;store full word into memory
                              ;(not byte writable)

Figure 1. Compiler-Generated Byte-Write Code



    mtsrim  bp,addr           ;put address into BP
    inbyte  temp,data,data    ;insert byte into proper data
                              ;position and low order byte
    store   0,17,temp,addr    ;write a single byte. The external
                              ;memory looks at OPT, Ax bits

Figure 2. Streamlined Byte-Write Code



Publication # 11636   Rev. A   Amendment /0   Issue Date: 11/89

© 1989 Advanced Micro Devices, Inc.

There are four distinguishable memory configurations, 
each of which can be treated in its own way. Whether 
the devices have an explicit output enable really deter- 
mines one's choices in selecting an alternative cycle 
type. If there is no explicit output enable and the I/O
pins are common or tied together, one must not allow a
"complete" read or there will be a bus crash.

Static RAMs with explicit output enables

The IDT 32K by 8 CMOS device is an example of a
static RAM with an explicit output enable. In the case of
these devices one can arrange to suppress either the
/Chip Select (the device will not cycle at all) or the /Write
Enable and the /Output Enable (the device will internally
execute a read but will not come out of high impedance).

Note that one cannot activate both /Chip Select and
/Output Enable to these devices without having them 
drive their data pins. 

Static RAMs without explicit output enables 

The Toshiba 5561 64K by 1 CMOS device is an ex- 
ample of a static RAM without explicit output enables. If 
they get /Chip Enable, they will either drive their data- 
out pins or execute a write, depending on the state of 
Write Enable. 

If their data inputs are connected to their data outputs 
(typical when connected to a bi-directional bus), /Chip 
Enable must be suppressed. 



Video DRAMS (VDRAMs) with explicit output 
enable 

VDRAMs allow more choice than any other technology. 
/RAS can be suppressed, preventing the cycle alto- 
gether. /CAS can be suppressed, turning the write into a 
RAS-only refresh cycle. /WE (and /DT-OE) can be sup- 
pressed, turning the cycle into an internal read. Of the 
three, I much prefer suppressing /CAS. First, I like the 
elegance of generating a RAS-only refresh, and sec- 
ond, /CAS is easier to suppress because it is generated 
later in the cycle than /RAS or /WE, as shown in the 
code below. 

The equations in Figure 3 allow for a Byte-Order
(little/big endian)¹ input that effectively is XORed with
the address bits. This signal is not a pin on the Am29000. It is
a bit in the configuration register. If this bit always is 
programmed to the same value in a given system, one 
implements only the appropriate min-terms. If the signal 
is dynamic in a system, a copy must be kept up-to-date 
in an external register. 
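
The byte-lane selection that the Figure 3 equations perform can be sketched in Python for the big-endian case. This is a minimal model under stated assumptions, not the PAL source: the function name and the tuple ordering are hypothetical, only the big-endian min-terms are modelled, and a half-word access with A0 set enables no lane, which mirrors the !A0 literal in the big-endian half-word terms.

```python
# OPT1-OPT0 encodings as they appear in the Figure 3 min-terms
WORD, BYTE, HALFWORD = 0b00, 0b01, 0b10


def cas_enables(opt, a1, a0):
    """Return (CS31, CS23, CS15, CS07); True means /CAS is allowed
    to reach that byte lane. Big-endian only (assumption)."""
    if opt == WORD:
        return (True, True, True, True)
    if opt == HALFWORD:
        if a0:                         # big-endian HW terms require !A0
            return (False, False, False, False)
        upper = (a1 == 0)              # A1 = 0 selects bits 31-16
        return (upper, upper, not upper, not upper)
    if opt == BYTE:
        lane = (a1 << 1) | a0          # 0 selects bits 31-24, 3 selects bits 7-0
        return tuple(lane == i for i in range(4))
    raise ValueError("unsupported OPT encoding")
```

A lane whose entry is False has its /CAS suppressed, so the write to that byte becomes a RAS-only refresh.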

DRAMS without explicit output enable 

256K or 1 Meg (by one) DRAMs do not have an explicit
output enable. Rather, if /CAS falls with /RAS low and
/WE high, the device will enable its output buffers. This
leaves the option of suppressing the cycle altogether by
suppressing /RAS, or turning it into a RAS-only refresh
by suppressing /CAS.

256K or 1 Meg by four DRAMs have an explicit output
enable. This makes them similar to the VDRAM case.



ICS31 = 
# 
# 
# 
# 

!CS23 = 
# 
« 
# 
# 

!CS15 = 
# 
* 
# 
# 

!CS07 = 
* 
# 
# 
* 



lOPTl & 
!0PT1 & 
!0PT1 & 
OPTl & 
OPTl & 
IGPTl & 
!0PT1 & 
.'OPTl & 
OPTl & 
OPTl & 
!0PT1 & 
!0PT1 & 
!0PT1 & 
OPTl & 
OPTl & 
lOPTl & 
!0PT1 & 
!0PT1 & 
OPTl & 
OPTl & 



!OPT0 

OPTO & 

OPTO & 

!OPT0 & 

!OPT0 & 

•OPTO 

OPTO & 

OPTO & 

!OPT0 & 

!OPT0 & 

!OPT0 

OPTO & 

OPTO & 

!OPT0 & 

!OPT0 & 

!OPT0 

OPTO & 

OPTO & 

!OPT0 & 

!OPT0 & 



!B0 & !A1 & 
BO & Al & 

!B0 & !A1 & 
BO & Al & 

!B0 & !A1 & 
BO & Al & 

!B0 & !A1 & 
BO & Al & 

!B0 & Al & 
80 & !A1 S 

!B0 & Al S 
BO & !A1 & 

!B0 S Al & 
BO & !A1 & 
!B0 & Al & 
BO & !A1 & 



!A0 
AO 
!A0 
!A0 

AO 
.'AO 
!A0 
!A0 

!A0 
AO 

!A0 
AO 

AO 
!A0 
!A0 

AO 



CAS 
CAS" 
CAS' 
CAS" 
CAS' 
CAS' 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 
CAS" 



TIME 
TIME 

"time 

TIME 
TIME 
TIME 
TIME 
TIME 
TIME 
TIME 
TIME 
TIME 

"time 
"time 
"time 
"time 
"time 
"time 
"time 
"time 



/*Word*/ 
/*Byte, Big*/ 
/*Byte, Little*/ 
/*HW, Big*/ 
/*HW, Little*/ 
/*Word*/ 
/*Byte, Big*/ 
/*Byte, Little*/ 
/*HW, Big*/ 
/*HW, Little*/ 
/*Word*/ 
/*Byte, Big*/ 
/*Byte, Little*/ 
/*HW, Big*/ 
/*HW, Little*/ 
/*Word*/ 
/*Byte, Big*/ 
/*Byte, Little*/ 
/*HW, Big*/ 
/*HW, Little*/ 



Figure 3. /CAS-Suppressing Code 

¹ Note that all AMD 29K Family software uses big endian byte ordering only. The little endian min-terms are shown for completeness only. Always
use big endian.
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Am29027 Hardware Interface 
Application Note 



by Bob Perlman 



INTRODUCTION 

The Am29027™ arithmetic accelerator interfaces simply
and efficiently to the Am29000™ streamlined instruction
processor. The interface is designed to run at speeds in
excess of 25 MHz, so care must be taken when connecting
the two parts on a circuit board.

This application note describes the rules to use (and the
hazards to be aware of) when designing a 29K™ system
containing the Am29027.

PROCESSOR/ACCELERATOR 
INTERCONNECT 

A diagram of an Am29000/Am29027 interconnect is
shown in Figure 1. The interconnect contains the
following signals:

Control signals: Eleven signals control the transfer of
data and instructions between the Am29000 and the
Am29027. Eight of these signals, R/W, DREQ, DREQT0,
DREQT1, OPT2-OPT0, and BINV, are generated by the
Am29000. These specify the accelerator transaction
requested by the Am29000. The three remaining signals,
CDA, DRDY, and DERR, are generated by the
Am29027. The CDA signal indicates whether the
Am29027 is ready to accept new instructions or operands.
The DRDY and DERR signals indicate that data
requested by the Am29000 is available on the Am29027
output port or that an error has occurred, respectively.

Data signals: The Am29027 R and S data input ports
(R31-R0 and S31-S0), instruction port (I31-I0), and data
output port (F31-F0) are connected to the Am29000
address (A0-A31) and data (D0-D31) buses. The
Am29000 uses its address and data buses to transfer
instructions and operands to the Am29027, and uses its
data port to read results from the Am29027.

Clock: The Am29027 CLK input is connected to the
Am29000 SYSCLK pin. The SYSCLK signal can be
generated in two ways: internal to the Am29000, by
applying a 2X clock signal to the Am29000 INCLK input
(as shown in Figure 1); or externally, by applying a clock
signal to the Am29000 SYSCLK pin.

System reset: The system reset signal is applied to
the Am29000 and Am29027 RESET inputs.

Most interconnect signals are direct connections. The
only exceptions are signals DRDY and DERR, which
must be passed through negative-logic OR gates (i.e.,
through conventional AND gates). These gates form
the logical OR of the DRDY and DERR signals of all
resources on the Am29000 processor channel. The
33 kΩ resistors shown connected to the CDA, DRDY,
and DERR signals leaving the Am29027 need be present
only if the system sometimes is operated without
the Am29027.



One interconnection is optional. The Am29027 EXCP 
signal, which indicates the presence of an unmasked 
arithmetic exception created by an accelerator opera- 
tion, can be connected to an Am29000 trap or interrupt 
input. This connection is necessary only if the system 
designer desires an imprecise processor interrupt in the 
presence of an accelerator exception. The Am29027 
contains internal mechanisms for recovering from 
errors; these mechanisms make the use of EXCP 
unnecessary in most systems.



Publication # 12215   Rev. A   Amendment /0   Issue Date: 11/89
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Figure 1. Am29000/Am29027 Hardware Interconnect 
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AN ALTERNATE INTERCONNECT 

In the interconnect shown in Figure 1, three Am29027 
ports are connected to the Am29000 data bus: input 
data port R, output data port F, and the instruction port. 
This places considerable capacitive loading on the 
Am29000 data bus: 12 pF each for input data port R and
the instruction port, and 20 pF for output port F, for a 
total of 44 pF. 

The Am29000 data bus can drive an 80-pF load without 
derating. In systems where the 44-pF load presented to 
the data bus by an Am29027 is excessive, an alternate 
interconnect can be used, as shown in Figure 2. In this 
configuration, the Am29027 instruction bus is con- 
nected to the Am29000 address bus, rather than to the 
data bus. This interconnect more evenly distributes 
the Am29027 capacitive load between the Am29000 
address and data buses. In this configuration the 
address bus has a load of 24 pF, the data bus 32 pF. 
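
The load figures quoted above can be checked with a short Python tally. This is only a bookkeeping sketch: the port loads for R, I, and F come from the text, while the 12-pF figure for input port S is an assumption inferred from the 24-pF address-bus total, and the dictionary and function names are hypothetical.

```python
# Per-port capacitive loads in pF. R, I and F are quoted in the text;
# S = 12 pF is an assumption inferred from the 24-pF address-bus total.
PORT_PF = {"R": 12, "S": 12, "I": 12, "F": 20}


def bus_load(ports):
    """Sum the Am29027 port loads hanging on one Am29000 bus."""
    return sum(PORT_PF[p] for p in ports)


figure1_data_bus = bus_load(["R", "I", "F"])     # standard interconnect
figure2_data_bus = bus_load(["R", "F"])          # instruction port moved
figure2_address_bus = bus_load(["S", "I"])       # to the address bus
```

Moving the instruction port shifts 12 pF from the data bus to the address bus, leaving both well under the 80-pF limit.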



The alternate interconnect, shown in Figure 2, is
software compatible with the interconnect of Figure 1. The
only requirement for this compatibility is that, when
transferring an accelerator instruction from the
Am29000 to the Am29027, the instruction must appear
on both the Am29000 address and data buses. For
example, an Am29000 co-processor store that transfers
an accelerator instruction from general-purpose register
gr96 to the accelerator instruction register must have
the form:

    store   1,CP_WRITE_INST,gr96,gr96

Note that gr96 is specified for both the RA and RB 
instruction fields, thus ensuring that the accelerator 
instruction to be transferred is placed on both the 
address bus and the data bus. All 29K accelerator code, 
including that produced by the 29K compilers, follows 
this convention. 
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Figure 2. Alternate Am29000/Am29027 Bus Connections 
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RULES TO FOLLOW 

Even though interconnecting the Am29000 and 
Am29027 is straightforward, a few precautions must be 
taken to ensure correct accelerator operation: 



• All signals except DRDY and DERR must be direct 
connects; the signals should not pass through other 
devices. For example, if the Am29000 address bus is 
buffered before being fed to a memory array, the 
Am29027 address bus connections must be made on 
the processor side of the buffers. 



• Signals DRDY and DERR should pass through one
(and only one) fast AND gate. The system designer
should take care to choose high-speed AND gates; a
74AS08, 74AS11, 74AS20, or 7.5 ns PAL® device will
suffice at 25 MHz.

• Keep signal interconnects short. Heavily loaded 
traces may have propagation speeds on the order of 
3-4 ns/foot. All signal traces, and in particular those
with the heaviest loading, should be kept as short as
possible.

• Minimize loading on the Am29000 data and address 
buses. These buses are designed to drive 80-pF loads 
without AC timing derating, and higher capacitances 
with derating. If Am29000 bus capacitances exceed 
80 pF, be sure to derate the AC parameters per the 
information provided in the Am29000 Streamlined 
Instruction Processor Data Sheet, order #09075. 

While the alternate bus connections shown in Figure 2 
will lower the capacitive loading presented to the 
Am29000 data bus, they do present a greater routing 
challenge than the connections of Figure 1.

WARNING: With the alternate connections of Figure 2, 
many signal lines must cross one another either under 
or near the Am29027. Before using the alternate 
connections, be sure to examine layout and routing 
requirements. 
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When is Interleaved Memory 
with the Am29000 Unnecessary? 
Application Note 



by Tom Crawford 



INTRODUCTION 

ABSTRACT 

This application note presents a graphic method of
finding the maximum acceptable access time of an
Am29000™ memory system that avoids the use of an
interleaved memory.



By analyzing the required access speed of memory 
devices for both interleaved and non-interleaved mem- 
ory, it is possible to determine the relative cost and 
performance for each approach. The analysis also iden- 
tifies the situations in which the system clock rate 
dictates the use of interleaved memory because suffi- 
ciently fast memory devices, needed to support a single- 
bank architecture, are unavailable. 



GENERAL 

The advantage of an interleaved memory is that slower 
and less expensive memory chips can be used. How- 
ever, the use of interleaved memory in systems that 
need only a limited amount of memory should be 
avoided, since interleaving doubles the minimum mem- 
ory size. The need to support two memory banks may 
waste a substantial amount of memory space and result
in a higher system cost. 

Advanced Micro Devices is developing a complete line 
of Am29000 simulators, hardware target execution ve- 
hicles, and high-level language development tools for 
the Am29000 32-bit Streamlined Instruction Processor. 
These products are designed to support end-users who 
are building embedded system applications based on 
the Am29000 processor. For these users, often there is 
no existing operating system or kernel for their hardware
design. 

The design trade-off is component count versus the 
required device access speed and density of memory. 



WHEN IS INTERLEAVING 
NECESSARY? 

Figure 1 shows a routine method of obtaining data for an 
instruction burst-mode access. (The instruction burst- 
mode access considerations discussed in this applica- 
tion note also apply to the data burst-mode access 
considerations.) 

A counter is loaded with the beginning address of the 
burst, then incremented to fetch successive words. The 
output of the counter goes through an address multi- 
plexer and then to the address inputs of the memory 
chips. The data output pins of the memory chips are 
connected directly to the Am29000 bus. 

Assuming the counter increments on the positive edge 
of SYSCLK, it is possible to calculate available time 
before the data must be valid. Figure 2 shows the avail- 
able time for a Static Column DRAM (tMAX). Any data 
buffers between the memory and the Am29000 would 
cause additional delays. 
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Figure 1. Typical Memory
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Figure 2. Single-Cycle Burst 
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In order to guarantee positive margins, the following
inequality must be satisfied:

Equation 1: n * tCLOCK - (tMAX + tPD_COUNT +
            tPD_MUX + tSU + tPD_WIRE) > 0

The value n is the number of clock cycles available for 
memory. If there is no interleaving or wait states, n = 1.
For two-way interleaving, n = 2, and so on. 

The maximum column address delay (static column 
decode DRAM) that can be allowed is tMAX. The clock- 
to-output delay of the counter is tPD_COUNT. The 
value of tPD_MUX is the input-to-output delay of the 
multiplexer. The value of tSU is the setup time for 
Am29000 instructions or data. 

The value of tPD_WIRE is the propagation delay from 
the multiplexer output to the furthest memory chip input. 
This is the propagation delay per unit length of wire 
times the length of the wire. The propagation delay per 
unit length can be estimated from the equation:¹

    tpd' = tpd * sqrt(1 + (Cd / Co))



The unloaded propagation delay (tpd) is determined
only by the board material dielectric constant. It is equal
to approximately 1.77 ns/ft. The trace capacitance (Co)
is a function of the trace impedance and propagation
delay and is usually taken to be approximately 18.5
pF/ft. The distributed capacitance (Cd) resulting from
the memory chips is calculated from the per-device input
capacitance and the device spacing; assuming 5 pF
per device and two devices per inch gives 120 pF/ft.

Using these numbers in the above equation yields:

    tpd' = 1.77 * sqrt(1 + (120 / 18.5)) = 4.84 ns/ft

Finally, assuming that 32 devices at 24 devices per foot
equals 1.33 ft, then the value for tPD_WIRE is 6.45 ns.
These numbers are summarized in Table 1.
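
The wire-delay arithmetic above can be reproduced with a short Python sketch. The function and variable names are hypothetical; the input values are the ones assumed in the text (1.77 ns/ft unloaded, 18.5 pF/ft trace capacitance, 5 pF per device, two devices per inch, 32 devices).

```python
import math


def loaded_delay_per_ft(tpd_unloaded, c_trace, c_distributed):
    """Propagation delay per foot of a trace with distributed loading:
    tpd' = tpd * sqrt(1 + Cd / Co)."""
    return tpd_unloaded * math.sqrt(1.0 + c_distributed / c_trace)


# Distributed load: 5 pF per device, 2 devices per inch, 12 inches per foot
cd_pf_per_ft = 5.0 * 2 * 12

tpd_loaded = loaded_delay_per_ft(1.77, 18.5, cd_pf_per_ft)   # ns/ft

# 32 devices spaced at 24 devices per foot
length_ft = 32 / 24
t_pd_wire = tpd_loaded * length_ft                           # ns
```

Changing the device spacing or input capacitance in this sketch shows directly how sensitive tPD_WIRE is to the board layout.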

Table 1. Initial Numbers

Name             Value     Obtained From
tPD_COUNT        6.5 ns    PAL16R8-7
tPD_MUX          8.0 ns    74F253 In to Zn
tPD_WIRE         6.5 ns    See discussion above
tSU, 25 MHz      6.0 ns    Am29000 25-MHz tSU
tSU, 20/16 MHz   8.0 ns    Am29000 20/16-MHz tSU



Figure 3 shows the results of these values in Equation 1.
The x-axis is tCLOCK and the y-axis is the allowable
access time. The solid line shows the allowable access 
time for n = 1 (single-cycle operation [no interleaving]). 
The dotted line shows the allowable access time for 
n = 2. 
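
The points plotted in Figure 3 can be regenerated by solving Equation 1 for tMAX. The following Python sketch does this with the Table 1 values; the function names are hypothetical, and treating 25 MHz as the only speed grade with the 6.0-ns setup time is an assumption taken from the table.

```python
# Fixed delays from Table 1, in nanoseconds
FIXED_NS = {"tPD_COUNT": 6.5, "tPD_MUX": 8.0, "tPD_WIRE": 6.5}


def setup_ns(mhz):
    """Am29000 setup time: 6.0 ns for the 25-MHz part, 8.0 ns at 20/16 MHz."""
    return 6.0 if mhz >= 25 else 8.0


def max_access_ns(mhz, n):
    """Largest tMAX that still satisfies Equation 1 with zero margin.

    n is the number of clock cycles available to the memory
    (1 = no interleaving, 2 = two-way interleaving).
    """
    t_clock = 1000.0 / mhz
    return n * t_clock - (sum(FIXED_NS.values()) + setup_ns(mhz))


for mhz in (25, 20, 16):
    print(mhz, "MHz:", max_access_ns(mhz, 1), "ns (n=1),",
          max_access_ns(mhz, 2), "ns (n=2)")
```

Each printed pair corresponds to one vertical line in Figure 3: the n = 1 value lies on the solid line and the n = 2 value on the dotted line.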



¹ See Appendix A of the Am29000 Memory Design Handbook (order #10623) for additional information on this equation.
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The discontinuity in the n = 1 line reflects the difference 
in tSU between 25 MHz and 20/16 MHz. The horizontal 
lines show the access times for -70, -80, and -100 
Toshiba 1M-by-1 DRAMs. The vertical lines show the 
minimum tCLOCK times for 25-, 20-, and 16-MHz 
Am29000s. The hatched area indicates where opera- 
tion is possible without interleaving. 

INITIAL RESULTS 

From inspection of Figure 3, it might be concluded that it
is almost possible to build a single-cycle burst memory
for a 16-MHz Am29000 from "fast" DRAMs with no
interleaving. However, one cannot build a single-cycle burst
memory for a 20- or 25-MHz system without interleaving
with any available DRAM.

Finally, using two-way interleaving, it is possible to build
a memory that supports single-cycle bursts at a clock
rate of 25 MHz or below, from memories with a column
address access time of less than 50 ns.
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ARE IMPROVEMENTS POSSIBLE? 

Could a system be built with single-cycle bursts without 
interleaving to run at 20 MHz? To answer this question 
graphically, move the heavy line in Figure 3 upwards 
(extending the hatched area to the left). This is done by 
reducing or eliminating the numbers, other than tMAX, 
in the inequality. These are examined below, one at a 
time. 



tPD_COUNT 

The 6.5 ns value is based on using a -7 PAL®. This is 
already faster than any 74F, 74AS, or 74ACT counter 
(or flip-flop, for that matter) in any data book this author 
has examined. 

It is certainly possible to "play games" with the clock
scheme. SYSCLK on the Am29000 could be driven a 
little later than the clock to the counter. Data hold time is 
unlikely to ever be a problem. But the uncertainties in 
propagation delay through a CMOS clock driver are 
likely to cancel a lot of what could be gained. Further- 
more, delaying the clock to the Am29000 delays the 
address on the initial cycle. 

tPD_MUX 

The 8.0 ns value is based on using a 74F253. A 1/2 ns 
reduction could be realized by building a multiplexer 
with a 16L8-7 (7.5 ns). A better way is to completely 
eliminate the multiplexer delay by building a three-state 
bus. Figure 4 shows one way to do this. 

The counter is implemented with a 16R8-7 (actually, 
more than one is probably required). An 8-bit counter is 
required and 2 additional bits of address must be main- 
tained. Since the clock is not gated, some additional 
inputs are required to indicate whether the counter 
should load, hold, or count. 



Just before RAS falls, the three-state buffer is enabled. 
When the Column Address is required, the three-state 
buffers of the PAL device are enabled and the counter is 
driven into the array. 

In this configuration, a worst-case design requires that 
the extraordinary loading on the PAL device be consid- 
ered. The total capacitance connected to the outputs 
of the PAL devices is greater than the standard load. 
However, the capacitances are distributed rather than 
lumped. The driver never sees the entire load, so the 
wire delay allowance is sufficient. 

tPD_WIRE

The wire delay can be reduced only by reducing the 
wiring length. Instead of connecting all the memory 
chips in serial, the board can be designed so that there 
are two sets of chips connected in parallel. This halves 
the 1.33-foot length previously calculated and reduces
the wire delay to 3.22 ns. 

tSU

To reduce tSU, a fast Am29000 at a reduced clock rate
can be used. For example, a 30-MHz Am29000 has a
tSU of only 5 ns; this is 3 ns better than a 16-MHz part,
but it is expensive.

Another approach is to insert a pipeline register with a 
very low setup time. For example, the data setup time of 
a 74F374 is only 2 ns. Of course, including a pipeline 
register has adverse consequences. The first access of 
a burst-mode access will then be one SYSCLK cycle
longer than would otherwise be required. In addition, the
control logic is made slightly more complicated. A posi- 
tive side effect is that three-state buffers are included in 
the register packages. Figure 5 shows registers in the 
instruction path. 
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Figure 4. Multiplexer Avoidance 
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Now, assuming the implementation of all the changes 
described above, the fixed numbers become the values 
shown in Table 2. 

Table 2. The Improved Numbers

Name             Value     Obtained From


If this is plotted as a function of cycle time, the line has 
moved up a considerable amount as compared to 
Figure 3. This indicates that it is possible to build a 
20-MHz system with the fastest available DRAMs. It 
also indicates that it is possible to build a 16-MHz 
system with 100-ns DRAMs. 
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Figure 5. Pipeline Access 
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CONCLUSION 

By substituting the values for proposed memory architectures
into Equation 1, two to four specific values of tMAX can
be determined for appropriate values of tCLOCK. With
this information it is easy to draw graphs like those of
Figures 3 and 6. Such graphs provide a simple display of
the available trade-offs between system clock rate, 
memory architecture, and the memory device access 
speed. Multiplying the memory device count for each 



configuration by the access-speed driven memory 
device costs of the configuration yields an approximate 
cost for each memory system approach. 

Such an analysis may point out significant cost 
reductions by quickly identifying those situations in 
which a non-interleaved memory architecture and 
reduced clock rate can support the required system 
performance. 
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Implementation of an Am29000 Stack Cache 
Application Note 



by Phil Bunce and Erin Farquhar 



INTRODUCTION 

This application note will describe the basic mecha- 
nisms of the AMD Am29000's cache of the run-time 
stack. The stack cache is an important performance fea- 
ture, because it permits a procedure's entire context to 
be resident in on-chip registers, thus eliminating, or at 
least reducing, the need for memory accesses. 

Our discussion is centered around a single example 
program, which is shown in its entirety in Appendix B. 
Before discussing this example, we provide a brief over- 
view of the basic operation of the stack cache. 



OVERVIEW 

Procedures executing on the Am29000 make use of a 
run-time stack, which consists of consecutive, overlap- 
ping structures called activation records. An activation 
record contains the dynamically allocated information 
specific to a particular activation of a procedure. Each 
time a procedure is called, a new activation record is 
allocated on the stack; when the procedure has finished 
executing, its activation record is deallocated from the 
stack. 

Compilers and assemblers for the Am29000 use two 
run-time stacks for activation records: the register stack 
and the memory stack. A procedure's activation record 
may be divided between these stacks. Both stacks grow 
toward lower addresses in memory, and items on the 
stacks are referenced as positive offsets from RSP
(Register Stack Pointer) and MSP (Memory Stack
Pointer). Both pointers are realized using internal
Am29000 global registers. The global and local regis- 
ters are both subsets of the general-purpose registers. 

The register stack contains parameters passed to the 
procedure, the local scalar variables used by the proce- 
dure, return linkage information, and the arguments that 
the procedure will pass to procedures that it in turn calls. 



The register stack is cached in the local registers,
lr0-lr127, as explained below.

The memory stack is used for local structured data, for 
example, arrays and records. It also is used for addi- 
tional scalar data when needed. When the scalar portion 
of the activation record for a particular procedure 
requires more than 128 words of local-register storage, 
the excess may be kept in the procedure's activation 
record in the memory stack. 

Both stacks are aligned on a double-word (64-bit) 
boundary. Procedures are required to maintain this 
alignment by adjusting the size of the register stack 
frame allocated at procedure entry to be a multiple of 
eight bytes. 
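
Rounding a frame size up to the next multiple of eight bytes is a one-line mask operation. The Python sketch below shows one common way to do it; the function name is hypothetical and the code is not taken from the example program.

```python
def aligned_frame_size(bytes_needed):
    """Round a register-stack frame size up to a multiple of eight bytes,
    so that the stack stays double-word (64-bit) aligned."""
    return (bytes_needed + 7) & ~7
```

For instance, a procedure that needs five words (20 bytes) would allocate 24 bytes, that is, six words.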

STACK CACHE 

The 128 local registers are used to cache locations in 
the register stack, such that when a procedure is active, 
its entire register-stack activation record is mapped to 
the local registers. 

Each word location in the register stack is mapped to a
single local register. The register number corresponding
to a location in the register stack is given by bits 8-2 of
the 32-bit memory address of that location in the register
stack. Because there are 128 local registers, quantities
whose addresses differ by 512 (all addresses are byte
addresses) are mapped to the same local register and
cannot be in the cache at the same time.
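
The mapping from a register-stack address to a cache slot can be written out as a short Python sketch. The function name is hypothetical; the bit selection follows the description above (bits 8-2 of the byte address).

```python
def local_register_for(address):
    """Index (0-127) of the local-register cache slot that holds the
    register-stack word at byte address `address` (bits 8-2)."""
    return (address >> 2) & 0x7F


# Two locations 512 bytes apart compete for the same slot
a = 0x4F00
same_slot = local_register_for(a) == local_register_for(a + 512)
```

Because only seven address bits are used, the cache behaves like a 128-word circular window onto the register stack.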

Figure 1 shows a snapshot of the register stack in mem- 
ory after some calls have been made, and the mapping 
of the register stack to the local registers. As shown in 
the figure, Global Register 1, called the Register Stack
Pointer (RSP), contains the 32-bit virtual address of the 
top of the register stack in memory. This virtual address 
on the Am29000 is the lowest-addressed valid stack 
location in the current activation record. 
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Figure 1. Mapping of Register Stack to Stack Cache 



Local registers are addressed as positive word offsets
from RSP, as in Figure 2. Specifically, when a local
register operand is specified in an instruction (that is, the
most significant bit of the register number is set), the
seven least significant bits are added to bits 8-2 of RSP
and the result is truncated to seven bits. For example, if
RSP has the value 0, as shown in Figure 2, then lr0 is
absolute register 128 (the first local register), and lr1 is
absolute register 129 (the second local register); if RSP
has the value four, then lr0 is absolute register 129 and
lr1 is absolute register 130.
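
The address calculation just described can be expressed as a small Python sketch. The function name is hypothetical; the arithmetic follows the text: add the local-register number to bits 8-2 of RSP, truncate to seven bits, and offset into the local half of the register file (absolute registers 128-255).

```python
def absolute_register(lr_number, rsp):
    """Absolute register number selected by local register `lr_number`
    (0-127) when the Register Stack Pointer holds byte address `rsp`."""
    offset = (rsp >> 2) & 0x7F                    # bits 8-2 of RSP
    return 128 + ((lr_number + offset) & 0x7F)    # truncate to seven bits
```

The truncation is what makes the local registers wrap around: with RSP equal to four, lr127 maps back to absolute register 128.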

Referring again to Figure 1, the current activation record
is delimited by the Frame Pointer (FP), which by
software convention uses Local Register 1, and RSP. FP
points to the "top" of the previous activation record, that
is, to the lowest-addressed word location above the
current activation record. When a procedure is active, this
entire area must be cached in local registers.

The register stack between FP and RFB (Register Free 
Bound) contains the saved activation records of previ- 
ously called procedures, which are also currently 
mapped to the local-register cache. RFB, by convention 
Global Register 127, is set to point to the lowest- 
addressed word in the register stack that is not mapped 
to the local registers. 
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Figure 2. Local Register Addressing 
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The register stack between RSP and RAB (Register
Allocate Bound) represents stack locations (and
corresponding local registers) that are currently "unused" and
thus available for allocation when another procedure is
called. RAB (by convention Global Register 126) is set
to point to the lowest-addressed word in the register
stack that is currently mapped to a local register.

When a procedure is called, RSP is decremented by the 
number of words required to accommodate the called 
procedure's activation record. When RSP is decre- 
mented beyond the location pointed to by RAB and thus 
beyond the available local registers, more local regis- 
ters will be required for the activation record, and some 
locations in the stack cache must be written to memory 
(or "spilled") before the new activation record is created. 
This condition is called overflow. Note that in Figure 1,
locations between RFB and the Start of Stack are saved 
activation records that have been previously spilled to 
memory. 

On return from a procedure, the activation record is
de-allocated by incrementing RSP by the same amount
it was decremented when the procedure was called. If
the caller's FP (which points to the highest location in the
caller's activation record) is greater than RFB (which
points to the first unmapped register stack location
above the activation record), the contents of that portion
of the register stack will have to be loaded into the local
registers to accommodate the caller's activation record.
This condition is called underflow.

Overflow and underflow conditions are detected by
instruction sequences in the prologue and epilogue,
which are the instruction sequences that execute as a
result of a procedure call and procedure return,
respectively, and cause a transfer of control to the appropriate
trap handler routine. In the case of an overflow, the trap
handler moves the contents of the required number of
local registers to the register stack in memory and
adjusts the value in RAB and RFB. In the case of an
underflow, the trap handler loads the required number of
register stack locations into the local registers and
adjusts the value in RAB and RFB.
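
The bookkeeping performed on overflow and underflow can be modelled with a brief Python sketch. This is a simplified illustration, not the trap handlers from the example program: the function names and the dictionary holding RSP, RAB, and RFB are hypothetical, the actual data movement between registers and memory is omitted, and the model assumes that RAB and RFB always move together so that the cached window stays 512 bytes wide.

```python
WINDOW_BYTES = 128 * 4      # 128 local registers, one word each


def on_call(state, frame_bytes):
    """Allocate a frame. If RSP drops below RAB (overflow), slide the
    window down; returns the number of bytes that must be spilled."""
    state["rsp"] -= frame_bytes
    spilled = max(0, state["rab"] - state["rsp"])
    state["rab"] -= spilled
    state["rfb"] -= spilled
    return spilled


def on_return(state, frame_bytes, caller_fp):
    """Release a frame. If the caller's FP lies above RFB (underflow),
    slide the window up; returns the number of bytes to be filled."""
    state["rsp"] += frame_bytes
    filled = max(0, caller_fp - state["rfb"])
    state["rab"] += filled
    state["rfb"] += filled
    return filled


# Hypothetical starting point: empty stack at 0x5000
state = {"rsp": 0x5000, "rfb": 0x5000, "rab": 0x5000 - WINDOW_BYTES}
first = on_call(state, 256)              # fits in the cache
second = on_call(state, 384)             # overflows by 128 bytes
back = on_return(state, 384, 0x5000)     # caller's frame needs a fill
```

In this run the second call spills 128 bytes and the matching return fills the same 128 bytes, after which RAB and RFB are back at their starting values.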
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OVERVIEW OF EXAMPLE PROGRAM 

Our example program consists of the four text files listed 
below. 

regdcl.h:   Register name declarations

macros.h:   Macro definitions for prologue and epilogue

start.s:    CPU Initialization
            Overflow and Underflow trap handler routines

example.s:  Two procedures main and recurse

Appendix A contains partial listings from the example 
program that are described individually in the sub-sec- 
tions below. 

Appendix B contains the source for the entire example 
program which includes all of the above files. 

INCLUDE FILES 

There are two include files, regdcl.h and macros.h.
Note that regdcl.h must be included before macros.h,
because macros.h uses definitions from regdcl.h.

In regdcl.h (see Appendix A-1, Register Declarations),
we assign the value 80 as the base of registers to be
used as temporaries by system software. Additional
temporaries will be addressed as offsets from it. These
registers will be used for work space in the start code
and the two trap handler routines.

    .equ  SYS_TMP, 80      ;system temp registers

We also assign symbolic names to global and local reg-
isters, in accordance with the software calling conven-
tions of the Am29000.

    .reg  rsp, gr1         ;local reg stack pointer
    .reg  msp, gr125       ;memory stack pointer
    .reg  rab, gr126       ;register allocate bound
    .reg  rfb, gr127       ;register free bound
    .reg  fp, lr1          ;frame pointer
    .reg  raddr, lr0       ;return address

The overflow and underflow trap vectors, V_SPILL and
V_FILL, are set to the constant values 64 and 65. These
are the vector numbers for the trap handlers chosen for
this example.

    .equ  V_SPILL, 64
    .equ  V_FILL, 65

The second include file in our example program,
macros.h, contains the macro definitions for PROLOGUE
and EPILOGUE. These macros are discussed in the
Prologue and Epilogue sections.

START CODE

The module start.s contains code that sets up the exe-
cution environment for our example program. The initial
portion of the start code is shown in Appendix A-2, Start
Code. The overflow and underflow trap handlers, also in
start.s, will be discussed later.

We set the beginning of the stack (its highest address in
memory) at 0x5000. The "& ~7" in the expression en-
sures that the value is a multiple of eight, with rounding
downward if necessary.

    .equ  TOP_STK, (0x5000 & ~7)   ;create double word alignment

The two temporary registers, tmp1 and tmp2, are
assigned values that are offsets of SYS_TMP, which
means that tmp1 is Global Register 80, and tmp2 is
Global Register 81.

    .reg  tmp1, %%(SYS_TMP + 0)
    .reg  tmp2, %%(SYS_TMP + 1)

Then we initialize the four pointers that define the stack
environment.

    const rsp, (TOP_STK-8)     ;set stack pointer
    add   rsp, rsp, 0          ;update rsp
    const rab, (TOP_STK-512)   ;set register alloc bound
    const fp, TOP_STK          ;set frame ptr
    const rfb, TOP_STK         ;set reg free bound

Figure 3 shows the initialized stack. Because there has 
been no spilling of local registers to the stack in memory, 
RFB points to the top of the stack. RAB is, by definition, 
512 bytes less than RFB. In the initial activation record, 
defined by FP and RSP, FP points to the top of the stack 
(because there has been no prior context) and RSP is 
set to a value eight bytes less than FP to allow for the 
current FP and raddr when a new activation record 
is created. Note that the setting of RSP must pre- 
cede the setting of FP by at least two instructions 
because of the delayed effect of modifying RSP, and
that an explicit arithmetic or logical instruction must
be used to update RSP.
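
The initial pointer values can be checked with a few lines of arithmetic. This Python sketch simply repeats the expressions from the start code; the second alignment example (0x5005) is a made-up input added to show the rounding.

```python
TOP_STK = 0x5000 & ~7          # round down to a multiple of eight

rsp = TOP_STK - 8              # room for the current FP and raddr
rab = TOP_STK - 512            # register allocate bound
fp = TOP_STK                   # frame pointer
rfb = TOP_STK                  # register free bound

print(hex(TOP_STK), hex(rsp), hex(rab))   # 0x5000 0x4ff8 0x4e00
print(rfb - rab)                          # 512
print(hex(0x5005 & ~7))                   # 0x5000 (rounded downward)
```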

The CPS (Current Processor Status Register) is initial- 
ized with the value 0x0072. Assuming the prior state of 
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this register was Reset mode (shown in Figure 4), we 
have in effect cleared FZ, DA, and RE, and left the other
bits unchanged. The FZ (Freeze) bit is cleared because
the processor is unfrozen for normal operation. (For a
description of the Freeze bit, refer to the section called
"Special-Purpose Registers," in the Am29000 User's
Manual.) We clear the DA (Disable All Interrupts and
Traps) bit to enable all traps. The RE (ROM Enable) bit
is cleared because this example assumes we are exe-
cuting from RAM.



    mtsrim cps, 0x72        ;PD, PI, SM, DI

PD, PI, SM, and DI remain set, meaning that address
translation is disabled (PD and PI), supervisor mode is
selected (SM), and external interrupts are disabled
(DI). Supervisor mode is selected because some of the
instructions in our example program are privileged.
Address translation is disabled because this example is
designed for systems not using the TLB. External inter-
rupts are disabled because we have no interrupting
devices and want to eliminate any spurious interrupt
requests.

We set the Vector Fetch bit in the Configuration Register
to select a vector table configuration for the Vector Area.

    mtsrim cfg, 0x10        ;VF

The VAB (Vector Area Base Address) register, which
specifies the beginning address of the vector table in
memory, is set to zero.

    mtsrim vab, 0
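
The CPS values used in this example can be decoded bit by bit. In the Python sketch below, the DA and FZ masks follow directly from the three values 0x72, 0x73, and 0x473 that the example writes; the individual positions of DI, SM, PI, PD, and RE are assumptions taken from the Am29000 register layout and should be checked against the User's Manual. The reset value is also an assumption: it is modelled as all seven named bits set.

```python
# Bit masks for the CPS fields named in the text (positions of
# DI, SM, PI, PD, and RE are assumed, see the lead-in).
CPS_BITS = {
    "DA": 0x001,  # Disable All Interrupts and Traps
    "DI": 0x002,  # Disable Interrupts
    "SM": 0x010,  # Supervisor Mode
    "PI": 0x020,  # address translation off for instructions
    "PD": 0x040,  # address translation off for data
    "RE": 0x100,  # ROM Enable
    "FZ": 0x400,  # Freeze
}


def decode_cps(value):
    """Return the names of the modelled bits that are set in value."""
    return [name for name, mask in CPS_BITS.items() if value & mask]


print(decode_cps(0x72))    # ['DI', 'SM', 'PI', 'PD']
print(decode_cps(0x73))    # ['DA', 'DI', 'SM', 'PI', 'PD']
print(decode_cps(0x473))   # ['DA', 'DI', 'SM', 'PI', 'PD', 'FZ']

# Clearing FZ, DA, and RE from the assumed reset state yields 0x72.
reset_like = sum(CPS_BITS.values())
cleared = reset_like & ~(CPS_BITS["FZ"] | CPS_BITS["DA"] | CPS_BITS["RE"])
print(hex(cleared))        # 0x72
```

The values 0x73 and 0x473 are the ones written by the trap handlers described in the Overflow and Underflow Trap Handler sections.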



Next we initialize the vector table with the address of
the Overflow trap handler routine, called SpillHandler.
First we load the address of SpillHandler into a tem-
porary register, using a CONST and a CONSTH instruction
for the case when SpillHandler is not in the first
64K bytes of memory.

    const  tmp1, SpillHandler
    consth tmp1, SpillHandler
Figure 3. Initialized Stack
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Figure 4. Current Processor Status Register in Reset Mode 



Because each entry in the vector table is four bytes, we
compute the address in the vector table by multiplying
the vector number V_SPILL (64) by four (a shift left by
two).

    const  tmp2, V_SPILL
    sll    tmp2, tmp2, 2     ;compute vector address

Then we store the address of SpillHandler (in tmp1) into
the vector table address we just computed.

    store  0, 0, tmp1, tmp2  ;write spill vector

Initializing the vector table with the address of the under- 
flow trap handler routine (vector number V_FILL) is 
done the same way: 

    const  tmp1, FillHandler
    consth tmp1, FillHandler
    const  tmp2, V_FILL
    sll    tmp2, tmp2, 2     ;compute vect addr
    store  0, 0, tmp1, tmp2  ;write fill vector
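
The vector-table arithmetic can be sketched in Python. The `vab` parameter and the sample handler address 0x00012340 are illustrative assumptions (the example sets VAB to zero and does not state where the handlers are located); `const_halves` models the 16-bit halves that a CONST/CONSTH pair loads.

```python
V_SPILL, V_FILL = 64, 65


def vector_entry_address(vector_number, vab=0):
    """Byte address of a four-byte vector table entry."""
    return vab + (vector_number << 2)


def const_halves(address):
    """Return the (low, high) 16-bit halves of a 32-bit address."""
    return address & 0xFFFF, (address >> 16) & 0xFFFF


print(vector_entry_address(V_SPILL))   # 256
print(vector_entry_address(V_FILL))    # 260

low, high = const_halves(0x00012340)   # hypothetical handler address
print(hex(low), hex(high))             # 0x2340 0x1
```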

The procedure start then calls main, passing it the return
address (lr0). A NOP follows the call because the
Am29000 always executes one instruction beyond a call
instruction before the call is taken.

    call  raddr, main
    nop
    halt                     ;halt after successful completion



EXAMPLE FUNCTIONS MAIN() AND RECURSE() 

After the start code has executed, control is passed to
the procedure main(). The purpose of main() is to call
the procedure recurse(), providing it with an initial set of
values. Recurse() calls itself a total of 86 times, then
returns to itself 86 times before returning to main(). An
overflow condition occurs with the 21st call, and each
subsequent call causes an additional spill of local regis-
ters to memory. When the program returns, the 22nd
return causes an underflow condition, and each subse-
quent return causes an additional fill from memory to the
local registers.

The basic operation of main() and recurse() is summa- 
rized by the following program: 

    main()
    {
        recurse(1, 42);
    }

    recurse(n, m)
    int n, m;
    {
        int i, j;

        if (n > 85) return;
        i = n + 1;
        recurse(i, m);
    }

The code for main() and recurse() is shown in Appendix
A-3 and A-4, Code for Main() and Code for Recurse(),
respectively.
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PROLOGUE 

As with all Am29000 procedures, main() begins with a 
prologue. The macro definition of PROLOGUE and 
the expansion of PROLOGUE for main() are shown in 
Appendix A-5 and A-6, Prologue Macro and Prologue 
Expansion for Main(), respectively. 

The purpose of PROLOGUE is to allocate an activation 
record and check for overflow before the body of the pro- 
cedure is executed. It is invoked with three parameters: 
the number of arguments passed (INCNT), the number 
of registers required for the procedure's local variables 
(LOCCNT), and the maximum number of arguments 
that the procedure may pass to any one function it in turn 
calls (OUTCNT). 

.macro PROLOGUE, INCNT, LOCCNT, OUTCNT 

The values of ALLOC_CNT and SIZE_CNT are com- 
puted from the parameters. 

    .set  ALLOC_CNT, ((2+OUTCNT+LOCCNT+1) & ~1)
    .set  SIZE_CNT, (ALLOC_CNT+2+INCNT)

ALLOC_CNT is the amount of space on the stack that
must be newly allocated by the Prologue for the proce-
dure's activation record. SIZE_CNT is the amount of space
that must be accessible by the procedure, that is, the
size of its activation record.

The expression for ALLOC_CNT does not use INCNT,
because incoming parameters were already allocated
space on the stack as the outgoing parameters (OUTCNT)
of the calling procedure. "2" is the number of
words needed for the called procedure's FP and return
address when it calls another procedure. ANDing the
expression with the complement of 1 (& ~1) maintains
double-word alignment on the stack by setting the least
significant bit to zero. The "+1" ensures that the amount
is rounded up, not down.

The expression for SIZE_CNT includes INCNT and two 
additional words for IrO (return address) and FP of the 
caller. 
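
As a worked example, the two expressions can be evaluated for the invocations used in this program, PROLOGUE 0,0,2 in main() and PROLOGUE 2,2,2 in recurse(). The Python function name below is an illustrative assumption; the arithmetic is copied from the .set directives.

```python
def prologue_counts(incnt, loccnt, outcnt):
    """Return (ALLOC_CNT, SIZE_CNT) in words for a PROLOGUE invocation."""
    alloc_cnt = (2 + outcnt + loccnt + 1) & ~1   # round up to an even count
    size_cnt = alloc_cnt + 2 + incnt
    return alloc_cnt, size_cnt


print(prologue_counts(0, 0, 2))   # (4, 6)   main()
print(prologue_counts(2, 2, 2))   # (6, 10)  recurse()
```

In bytes, main() therefore allocates 16 and recurse() allocates 24 on each call.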

The three macro variables, IN_PRM, LOC_REG, and
OUT_PRM are used to establish offsets into the stack
for input, local, and output arguments. These macro
variables are set only if the corresponding value of the
parameter is not equal to zero.

    .if (INCNT)
    .set  IN_PRM, (2 + ALLOC_CNT + 0x80)
    .endif
    .if (LOCCNT)
    .set  LOC_REG, (2 + OUTCNT + 0x80)
    .endif
    .if (OUTCNT)
    .set  OUT_PRM, (2 + 0x80)
    .endif

In the above, a macro variable is set equal to an expres-
sion that is evaluated to a local register number when
the program is assembled. The macro variable can then
be used as the base register for offset addressing of
parameters of that type (as shown in Figure 5). The
"0x80" provides the 128-word offset required for a local
register access.
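
The register numbers these expressions produce can be tabulated with a short Python sketch. The helper names are illustrative assumptions; the mapping from an absolute register number to a local register name assumes that absolute number 0x80 corresponds to lr0.

```python
LOCAL_BASE = 0x80   # first absolute register number of the local registers


def parameter_bases(incnt, loccnt, outcnt):
    """Evaluate IN_PRM, LOC_REG, and OUT_PRM as the macro would."""
    alloc_cnt = (2 + outcnt + loccnt + 1) & ~1
    bases = {}
    if incnt:
        bases["IN_PRM"] = 2 + alloc_cnt + LOCAL_BASE
    if loccnt:
        bases["LOC_REG"] = 2 + outcnt + LOCAL_BASE
    if outcnt:
        bases["OUT_PRM"] = 2 + LOCAL_BASE
    return bases


def local_name(absolute_number):
    """Name of the local register with the given absolute number."""
    return f"lr{absolute_number - LOCAL_BASE}"


for name, number in parameter_bases(2, 2, 2).items():
    print(name, number, local_name(number))
# IN_PRM 136 lr8
# LOC_REG 132 lr4
# OUT_PRM 130 lr2

print(parameter_bases(0, 0, 2))   # {'OUT_PRM': 130}
```

For recurse(), lr0 and lr1 hold the return address and FP slots, lr2 and lr3 the outgoing arguments, lr4 and lr5 the locals, and lr8 and lr9 the incoming arguments.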
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Figure 5. Prologue Parameters 
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The body of the PROLOGUE macro has three instruc- 
tions: 



    sub   rsp, rsp, (ALLOC_CNT << 2)
    asgeu V_SPILL, rsp, rab
    add   fp, rsp, (SIZE_CNT << 2)



In the above instructions, ALLOC_CNT and SIZE_CNT 
are shifted left by two to convert them from word quan- 
tities to the required byte quantities (the stack regis- 
ters, whose contents will be modified, contain byte 
addresses). 

The first instruction allocates an activation record by 
decrementing RSP by the amount ALLOC_CNT. 

The second instruction asserts that RSP of the new acti-
vation record is greater than or equal to RAB. If this is
not the case (that is, RSP has been decremented
beyond RAB), an overflow trap occurs, and there is a
transfer of control to the trap handler routine,
SpillHandler, pointed to by the vector V_SPILL. The trap
handler will move (spill) the contents of the required
number of local registers to the register stack in memory
and adjust RFB and RAB, as described in the Overflow
Trap Handler section.

The third instruction sets FP to point to the location just 
above the new activation record, so it can be used 
for underflow checking in the EPILOGUE macro of a 
procedure that is called by this procedure (see Epilogue 
section). 

After the prologue, main() calls recurse(). The expan- 
sion of PROLOGUE for recurse() is shown in Appendix 
A-7, Prologue Expansion for Recurse(). 

OVERFLOW TRAP HANDLER 

On the 21st call to itself, recurse() causes an overflow 
trap. The code that services this trap is shown in Appen- 
dix A-8, Overflow Trap Handler, and is described below. 
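
The point at which the first overflow occurs can be reproduced from the start-code values and the two prologue invocations. The Python sketch below is a simplified model, assuming only the RSP and RAB arithmetic described so far; it counts activations of recurse(), with the call from main() as the first.

```python
TOP_STK = 0x5000 & ~7
ALLOC_MAIN = 4 << 2      # ALLOC_CNT for PROLOGUE 0,0,2, in bytes
ALLOC_RECURSE = 6 << 2   # ALLOC_CNT for PROLOGUE 2,2,2, in bytes


def first_overflowing_activation():
    """Return (activation number, bytes by which RSP passes RAB)."""
    rsp = TOP_STK - 8
    rab = TOP_STK - 512
    rsp -= ALLOC_MAIN              # prologue of main(), no overflow
    activation = 0
    while True:
        activation += 1
        rsp -= ALLOC_RECURSE       # sub rsp, rsp, (ALLOC_CNT << 2)
        if rsp < rab:              # asgeu V_SPILL, rsp, rab fails
            return activation, rab - rsp


print(first_overflowing_activation())   # (21, 16)
```

Under these assumptions the 21st activation of recurse() is the first to trap, and the first spill covers 16 bytes (four registers).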



In the following discussion of SpillHandler, we assume
the reader is familiar with the processor's response to
traps. If not, refer to the section called Interrupt and Trap
Handling in the Am29000 User's Manual.

The first three .reg directives assign symbolic names 
to the three temporary system registers used by 
SpillHandler. 

    .reg  R_Cnt,    %%(SYS_TMP+0)   ;temp for count
    .reg  R_TmpPC0, %%(SYS_TMP+1)   ;temp for PC0
    .reg  R_TmpPC1, %%(SYS_TMP+2)   ;temp for PC1

The old PCs are saved in two of the temporary registers 
just declared. 



    mfsr  R_TmpPC0, pc0     ;save the PCs
    mfsr  R_TmpPC1, pc1



The CPS (Current Processor Status Register) is set to 
the value 0x73. This clears the FZ (Freeze) bit, which 
was set by hardware when the trap was taken (see 
Figure 6), so that the trap handler can execute a Store 
Multiple instruction. (Note that the PCs must be saved 
before the FZ bit is cleared.) The DA (Disable All Inter- 
rupts and Traps) bit remains set, which prevents the 
processor from taking any traps except the *WARN, 
Instruction Access Exception, Data Access Exception, 
and Coprocessor Exception traps. PD, PI, SM, and DI
also remain set.

    mtsrim cps, 0x73        ;PD, PI, SM, DI, DA



Now we can use the Store Multiple instruction to store 
the required number of local registers into the register 
stack in memory. This instruction requires a source, a 
destination, and a count. 
As explained earlier (and shown in Figure 1), the area
between RSP and RAB represents the local regis- 
ters available for allocation when a procedure is called. 
Because there has been an overflow and RSP has been 
decremented beyond RAB, we can compute the size of 
the required spill (the count for the Store Multiple) by 
subtracting RSP from RAB. 



    sub   R_Cnt, rab, rsp   ;R_Cnt = number of bytes to spill

Then we use R_Cnt to adjust RFB, so that it correctly
reflects the area in the register stack that will be mapped
to the local registers.

    sub   rfb, rfb, R_Cnt   ;move down the frame bound

Before using the Store Multiple instruction, R_Cnt must
be written as a word amount into the CR field of the
Channel Control register, which is used by the proces-
sor to determine the number of words to transfer. So we
convert R_Cnt from a byte to a word amount using the
Shift Right Logical instruction.

    srl   R_Cnt, R_Cnt, 2   ;R_Cnt = count of words to spill



Because the CR field is zero-based, we subtract one
from R_Cnt

    sub   R_Cnt, R_Cnt, 1   ;correct for storem

and then use the Move to Special Register instruction
to write it to the CR field.

    mtsr  cr, R_Cnt         ;set up count for storem

The local registers that have to be spilled are those cor-
responding to register-stack locations between RSP
and RAB, because they are the local registers that must
be occupied by the new activation record. So the in-
struction source will be lr0, which corresponds to RSP.
The instruction's destination will be the register-stack
location pointed to by the previously modified RFB, be-
cause that is the register-stack location at the correct
512-byte offset from RSP.

    storem 0, 0, lr0, rfb   ;spill from the allocated area

Then we set RAB to point to the top of stack, because
that is now the lowest stack address currently cached in
local registers.

    add   rab, rsp, 0       ;move down the allocate bound

We set CPS to the value 0x473. This sets the FZ bit,
which must be set before we restore PC0 and PC1. PD,
PI, SM, DI, and DA remain set.

    mtsrim cps, 0x473       ;FZ, PD, PI, SM, DI, DA

Then the two PCs are restored and the IRET (Interrupt
Return) instruction restores the previous contents of
CPS from the Old Processor Status Register, unfreezes
the processor, and begins fetching from PC0 and PC1.

    mtsr  pc0, R_TmpPC0     ;restore the PCs
    mtsr  pc1, R_TmpPC1
    iret
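
The pointer arithmetic performed by SpillHandler can be summarized in a small Python model. It tracks only the values of RSP, RAB, RFB, and the CR count and does not model the register transfer itself; the function name and the returned dictionary are illustrative assumptions. The sample input is the state at the first overflow of the example program, as derived from the start-code values.

```python
def spill(rsp, rab, rfb):
    """Model the pointer updates made by the overflow trap handler."""
    count_bytes = rab - rsp         # sub R_Cnt, rab, rsp
    rfb -= count_bytes              # sub rfb, rfb, R_Cnt
    cr = (count_bytes >> 2) - 1     # srl by 2, then subtract 1 (zero-based)
    store_address = rfb             # storem writes memory starting at new RFB
    rab = rsp                       # add rab, rsp, 0
    return {"cr": cr, "store_address": store_address, "rab": rab, "rfb": rfb}


state = spill(rsp=0x4DF0, rab=0x4E00, rfb=0x5000)
print(state["cr"], hex(state["store_address"]))   # 3 0x4ff0
print(hex(state["rab"]), hex(state["rfb"]))       # 0x4df0 0x4ff0
print(state["rfb"] - state["rab"])                # 512
```

After the handler runs, RFB is again exactly 512 bytes above RAB, which is the invariant stated in the handler's comments.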
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EPILOGUE 

When recurse has called itself 86 times, it returns and
executes an Epilogue. The EPILOGUE macro is shown
in Appendix A-9, EPILOGUE Macro.

EPILOGUE's first instruction de-allocates the proce-
dure's activation record by adding ALLOC_CNT to RSP.
This is followed by a NOP, because a change in the
value of RSP must be separated by at least one cycle
from an instruction that references a local register (in
this case, the instruction JMPI, whose operand raddr is
lr0).

    add   rsp, rsp, (ALLOC_CNT << 2)
    nop
    jmpi  raddr

Before the Jump Indirect instruction finishes executing,
the next instruction, ASLEU, is executed. This instruc-
tion asserts that the caller's FP, now restored because
the caller's RSP has been restored, is less than or equal
to RFB. If the assertion is false (which means that FP is
pointing to an unmapped, previously spilled register-
stack location), an underflow trap occurs, and control is
transferred to the trap handler routine, FillHandler,
pointed to by the vector V_FILL. The trap handler will
move the contents of locations in the register stack
to the local registers and adjust RAB and RFB, as
described in the Underflow Trap Handler section.

    asleu V_FILL, fp, rfb
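
The effect of the epilogue on RSP, and the test it applies, can be modelled as follows. This Python sketch is illustrative only: the function name and the sample addresses are hypothetical, and the caller's FP is passed in as a plain value rather than read from lr1.

```python
def epilogue(rsp, alloc_cnt, caller_fp, rfb):
    """Return (restored RSP, True if the fill trap would be taken)."""
    rsp += alloc_cnt << 2               # add rsp, rsp, (ALLOC_CNT << 2)
    takes_fill_trap = caller_fp > rfb   # asleu V_FILL, fp, rfb fails
    return rsp, takes_fill_trap


new_rsp, trap = epilogue(0x4DF0, 6, 0x4E30, 0x4FF0)
print(hex(new_rsp), trap)               # 0x4e08 False

new_rsp, trap = epilogue(0x4FD8, 6, 0x5000, 0x4FF0)
print(hex(new_rsp), trap)               # 0x4ff0 True
```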

At the end of the Epilogue, the parameters are set to an
illegal value. This ensures that if they are used again
before they are explicitly set, an assembly-time error will
be reported.

    .set  IN_PRM, (1024)      ;illegal, to cause err on ref
    .set  LOC_REG, (1024)     ;illegal, to cause err on ref
    .set  OUT_PRM, (1024)     ;illegal, to cause err on ref
    .set  ALLOC_CNT, (1024)   ;illegal, to cause err on ref

The expansion of EPILOGUE for recurse() is shown in 
Appendix A-10, Epilogue Expansion for Recurse(). 



UNDERFLOW TRAP HANDLER 

On the 22nd return of recurse() to itself, an underflow
trap occurs. The code that services this trap is shown in
Appendix A-11, Underflow Trap Handler, and is dis-
cussed below.

The two old PCs are saved in temporary registers 
declared in the SpillHandler routine. 



    mfsr  R_TmpPC0, pc0     ;save the PCs
    mfsr  R_TmpPC1, pc1



The CPS (Current Processor Status Register) is set to
the value 0x73. This clears the FZ bit, so that the trap
handler can execute a Load Multiple instruction. The DA
bit remains set, which prevents the processor from tak-
ing any traps except the *WARN, Instruction Access
Exception, Data Access Exception, and Coprocessor
Exception traps. PD, PI, SM, and DI also remain set.

    mtsrim cps, 0x73        ;PD, PI, SM, DI, DA

We will use the Load Multiple instruction to load loca- 
tions in the register stack into the local registers. The 
Load Multiple instruction requires a source, a destina- 
tion, and a count. 

Clearly, the source for the Load Multiple instruction is 
the location pointed to by RFB, since RFB points to the 
first location in the register stack that was previously 
spilled from the local registers. 

The destination of the Load Multiple instruction will, of
course, be the local register corresponding to RFB.
Local registers may be specified as instruction oper-
ands in one of two ways: using a local register number
(in the range from 0 to 127), or using the absolute regis-
ter number (in the range 128 to 255) in an Indirect
Pointer Register. With the first method, the local register
number is computed as a positive word offset of RSP.
This option is not available to us because the trap han-
dler has no way of knowing the offset from RSP (that is,
the local register number) corresponding to RFB.

So we will convert the address in RFB to an absolute
local register number, put this number in Indirect Pointer
A (because the destination operand uses Indirect
Pointer A), and then specify Global Register 0 (which
indicates an indirect pointer access) as the destination
register in the Load Multiple instruction.

To convert the address in RFB to an absolute local reg- 
ister number, we OR it with 512. This sets bit 9, which 
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selects a local register; bits 2-8 give the absolute local 
register number. 



    const R_Cnt, 512        ;make local reg ip
    or    R_Cnt, R_Cnt, rfb ;from rfb

Then we use the Move To Special Register instruction to
put this value in the Indirect Pointer A Register.

    mtsr  ipa, R_Cnt        ;set up indirect ptr for loadm

Recalling that the underflow trap was signaled because
FP is pointing to an unmapped and previously spilled
register stack location at a higher memory address than
RFB, we can compute the number of local registers to fill
by subtracting RFB from FP.

    sub   R_Cnt, fp, rfb    ;R_Cnt = # of bytes to fill

We use the just-computed value to adjust RAB, so that
it correctly points to the new lower bound of the regis-
ter stack mapped to local registers. We perform this
operation now because it requires a byte amount, and
R_Cnt will be converted to a word amount in the next
instruction.

    add   rab, rab, R_Cnt   ;move up the allocate bound

Before use of the Load Multiple instruction, the count
must be written as a word amount into the CR field
of the Channel Control Register. Hence, we convert
R_Cnt from a byte to a word amount using the Shift
Right Logical instruction.

    srl   R_Cnt, R_Cnt, 2   ;R_Cnt = number of words to fill

Because the CR field is zero-based, we subtract one
from R_Cnt

    sub   R_Cnt, R_Cnt, 1   ;correct for loadm

and then use the Move to Special Register instruction to
write it to the CR field.

    mtsr  cr, R_Cnt         ;set up count for loadm



Now we use the Load Multiple instruction to transfer the
contents of the register stack in memory to the local reg-
isters, specifying RFB as the address in the register
stack from which to load, and gr0 (Indirect Pointer A) as
the local register number at which to begin the fill.

    loadm 0, 0, gr0, rfb    ;fill area freed

After the registers have been filled, we update RFB so 
that it correctly points to the upper bound of the register 
stack that is currently cached. 



    add   rfb, fp, 0        ;move up frame bound



We set CPS to the value 0x473. This sets the FZ bit,
which must be set before we restore PC0 and PC1. PD,
PI, SM, DI, and DA remain set.

    mtsrim cps, 0x473       ;FZ, PD, PI, SM, DI, DA

Then the two PCs are restored and the IRET (Interrupt
Return) instruction restores the previous contents of
CPS, unfreezes the processor, and begins fetching from
PC0 and PC1.

    mtsr  pc0, R_TmpPC0     ;restore the PCs
    mtsr  pc1, R_TmpPC1
    iret
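
The calculations in FillHandler can be collected into one Python model. As with the spill case, only pointer values are tracked; the function name, the returned dictionary, and the sample addresses are illustrative assumptions. The absolute register number is extracted as bits 2-9 of the value written to Indirect Pointer A, which is one reading of the description above (bit 9 selects the local registers, bits 2-8 give the register within them).

```python
def fill(fp, rab, rfb):
    """Model the pointer updates made by the underflow trap handler."""
    ipa = 512 | rfb                     # const R_Cnt,512 / or R_Cnt,R_Cnt,rfb
    first_register = (ipa >> 2) & 0xFF  # absolute register number, 128..255
    count_bytes = fp - rfb              # sub R_Cnt, fp, rfb
    rab += count_bytes                  # add rab, rab, R_Cnt
    cr = (count_bytes >> 2) - 1         # srl by 2, then subtract 1 (zero-based)
    load_address = rfb                  # loadm reads memory starting at old RFB
    rfb = fp                            # add rfb, fp, 0
    return {"first_register": first_register, "cr": cr,
            "load_address": load_address, "rab": rab, "rfb": rfb}


state = fill(fp=0x4E38, rab=0x4C20, rfb=0x4E20)   # hypothetical addresses
print(state["first_register"], state["cr"])        # 136 5
print(hex(state["load_address"]))                  # 0x4e20
print(hex(state["rab"]), hex(state["rfb"]))        # 0x4c38 0x4e38
print(state["rfb"] - state["rab"])                 # 512
```

As in the spill case, RFB remains 512 bytes above RAB after the handler has run.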
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APPENDIX A: 

PARTIAL LISTINGS EXTRACTED FROM EXAMPLE PROGRAM 



A-1. REGISTER DECLARATIONS 



; Global registers

    .equ  SYS_TMP, 80     ;system temp registers
    .reg  rsp, gr1        ;local register stack pointer
    .reg  msp, gr125      ;memory stack pointer
    .reg  rab, gr126      ;register allocate bound
    .reg  rfb, gr127      ;register free bound

; Local compiler registers
; (only valid if frame has been established)

    .reg  fp, lr1         ;frame pointer
    .reg  raddr, lr0      ;return address

; Vectors

    .equ  V_SPILL, 64
    .equ  V_FILL, 65
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A-2. START CODE 



    .include "regdcl.h"
    .equ     TOP_STK, (0x5000 & ~7)   ;create double word aligned value

    .text
    .global  start
    .reg     tmp1, %%(SYS_TMP + 0)
    .reg     tmp2, %%(SYS_TMP + 1)

start:
    const    rsp, (TOP_STK-8)         ;set stack ptr
    add      rsp, rsp, 0              ;set shadow rsp
    const    rab, (TOP_STK-512)       ;set reg alloc bound
    const    fp, TOP_STK              ;set frame ptr
    const    rfb, TOP_STK             ;set reg free bound

; set correct mode
    mtsrim   cps, 0x72                ;PD, PI, SM, DI
    mtsrim   cfg, 0x10                ;VF
    mtsrim   vab, 0

; connect up spill handler
    const    tmp1, SpillHandler
    consth   tmp1, SpillHandler
    const    tmp2, V_SPILL
    sll      tmp2, tmp2, 2            ;compute vect addr
    store    0, 0, tmp1, tmp2         ;write spill vector

; connect up fill handler
    const    tmp1, FillHandler
    consth   tmp1, FillHandler
    const    tmp2, V_FILL
    sll      tmp2, tmp2, 2            ;compute vect addr
    store    0, 0, tmp1, tmp2         ;write fill vector

; call main program
    call     raddr, main
    nop
    halt                              ;halt after successful completion
A-3. CODE FOR MAIN()

    .include "regdcl.h"
    .include "macros.h"
    .global  main

; main()
; {
;     recurse(1, 42);
; }

main:
    PROLOGUE 0,0,2                    ;invoke macro 0 ic, 0 loc, 2 og

; name outgoing args
    .reg     M_out_n, %%(OUT_PRM + 0)
    .reg     M_out_m, %%(OUT_PRM + 1)

; recurse(1, 42)
    const    M_out_m, 42
    call     raddr, recurse
    const    M_out_n, 1

    EPILOGUE
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A-4. CODE FOR RECURSE() 

    .global  recurse

; recurse(n, m)
; {
;     int i, j;
;     if (n > 85) return;
;     i = n + 1;
;     recurse(i, m);
; }

recurse:
    PROLOGUE 2,2,2                    ;invoke macro 2 ic, 2 loc, 2 og

; name ic args
    .reg     R_in_n, %%(IN_PRM + 0)
    .reg     R_in_m, %%(IN_PRM + 1)

; name locals
    .reg     R_i, %%(LOC_REG + 0)
    .reg     R_j, %%(LOC_REG + 1)

; name outgoing args
    .reg     R_out_n, %%(OUT_PRM + 0)
    .reg     R_out_m, %%(OUT_PRM + 1)

; name temporary register
    .reg     R_tmp, lr0

; if (n > 85) return
    cpgt     R_tmp, R_in_n, 85
    jmpt     R_tmp, rec_01

; i = n + 1
    add      R_i, R_in_n, 1

; recurse(i, m)
    add      R_out_m, R_in_m, 0
    call     raddr, recurse
    add      R_out_n, R_i, 0

rec_01:
    EPILOGUE
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A-5. PROLOGUE MACRO 



    .macro PROLOGUE, INCNT, LOCCNT, OUTCNT

; Parameters: INCNT   input parameter count
;             LOCCNT  local register count
;             OUTCNT  output parameter count

    .set   ALLOC_CNT, ((2 + OUTCNT + LOCCNT + 1) & ~1)
    .set   SIZE_CNT, (ALLOC_CNT + 2 + INCNT)

    .if (INCNT)
    .set   IN_PRM, (2 + ALLOC_CNT + 0x80)
    .endif

    .if (LOCCNT)
    .set   LOC_REG, (2 + OUTCNT + 0x80)
    .endif

    .if (OUTCNT)
    .set   OUT_PRM, (2 + 0x80)
    .endif

    sub    rsp, rsp, (ALLOC_CNT << 2)
    asgeu  V_SPILL, rsp, rab
    add    fp, rsp, (SIZE_CNT << 2)
    .endm


A-6. PROLOGUE EXPANSION FOR MAIN()

    PROLOGUE 0,0,2                    ;invoke macro
    .set   ALLOC_CNT, ((2+2+0+1) & ~1)
    .set   SIZE_CNT, (ALLOC_CNT + 2 + 0)
    .set   OUT_PRM, (2 + 0x80)
    sub    rsp, rsp, (ALLOC_CNT << 2)
    asgeu  V_SPILL, rsp, rab
    add    fp, rsp, (SIZE_CNT << 2)
A-7. PROLOGUE EXPANSION FOR RECURSE()

recurse:
    PROLOGUE 2,2,2                    ;invoke macro
    .set   ALLOC_CNT, ((2+2+2+1) & ~1)
    .set   SIZE_CNT, (ALLOC_CNT + 2 + 2)
    .set   IN_PRM, (2 + ALLOC_CNT + 0x80)
    .set   LOC_REG, (2 + 2 + 0x80)
    .set   OUT_PRM, (2 + 0x80)
    sub    rsp, rsp, (ALLOC_CNT << 2)
    asgeu  V_SPILL, rsp, rab
    add    fp, rsp, (SIZE_CNT << 2)



A-8. OVERFLOW TRAP HANDLER 

    .reg   R_Cnt,    %%(SYS_TMP + 0)  ;temp for count (shared)
    .reg   R_TmpPC0, %%(SYS_TMP + 1)  ;temp for PC0
    .reg   R_TmpPC1, %%(SYS_TMP + 2)  ;temp for PC1

    .global SpillHandler
SpillHandler:
; This routine handles a false assertion in the standard prologue.
; In:   rab > rsp         (requiring an allocation)
;       lr1 <= rfb
;       rfb == rab + 512
; Out:  rab == rsp        (just enough allocated)
;       lr1 <= rfb
;       rfb == rab + 512

    mfsr   R_TmpPC0, pc0              ;save the PCs
    mfsr   R_TmpPC1, pc1
    mtsrim cps, 0x73                  ;PD, PI, SM, DI, DA
    sub    R_Cnt, rab, rsp            ;R_Cnt = # of bytes to spill
    sub    rfb, rfb, R_Cnt            ;move down the frame bound
    srl    R_Cnt, R_Cnt, 2            ;R_Cnt = count of words to spill
    sub    R_Cnt, R_Cnt, 1            ;correct for storem
    mtsr   cr, R_Cnt                  ;set up count for storem
    storem 0, 0, lr0, rfb             ;spill from the allocated area
    add    rab, rsp, 0                ;move down the allocate bound
    mtsrim cps, 0x473                 ;FZ, PD, PI, SM, DI, DA
    mtsr   pc0, R_TmpPC0              ;restore the PCs
    mtsr   pc1, R_TmpPC1
    iret
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A-9. EPILOGUE MACRO 



; macro EPILOGUE

    .macro EPILOGUE
    add    rsp, rsp, (ALLOC_CNT << 2)
    nop
    jmpi   raddr
    asleu  V_FILL, fp, rfb
    .else
    jmpi   raddr
    nop
    .endif
    .set   IN_PRM, (1024)             ;illegal, to cause err on ref
    .set   LOC_REG, (1024)            ;illegal, to cause err on ref
    .set   OUT_PRM, (1024)            ;illegal, to cause err on ref
    .set   ALLOC_CNT, (1024)          ;illegal, to cause err on ref
    .endm



A-10. EPILOGUE EXPANSION FOR RECURSE()

    EPILOGUE
    add    rsp, rsp, (ALLOC_CNT << 2)
    nop
    jmpi   raddr
    asleu  V_FILL, fp, rfb
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A-11. UNDERFLOW TRAP HANDLER 

    .global FillHandler
FillHandler:
; This routine handles a false assertion in the standard epilogue.
; In:   lr1 > rfb         (requiring de-allocation)
;       rsp >= rab
;       rfb == rab + 512
; Out:  lr1 == rfb        (just enough freed)
;       rsp >= rab
;       rfb == rab + 512

    mfsr   R_TmpPC0, pc0              ;save the PCs
    mfsr   R_TmpPC1, pc1
    mtsrim cps, 0x73                  ;PD, PI, SM, DI, DA
    const  R_Cnt, 512                 ;make local reg ip
    or     R_Cnt, R_Cnt, rfb          ;from rfb
    mtsr   ipa, R_Cnt                 ;set up indirect ptr for loadm
    sub    R_Cnt, lr1, rfb            ;R_Cnt = # of bytes to fill
    add    rab, rab, R_Cnt            ;move up the allocate bound
    srl    R_Cnt, R_Cnt, 2            ;R_Cnt = number of words to fill
    sub    R_Cnt, R_Cnt, 1            ;correct for loadm
    mtsr   cr, R_Cnt                  ;set up count for loadm
    loadm  0, 0, gr0, rfb             ;fill area freed
    add    rfb, lr1, 0                ;move up frame bound
    mtsrim cps, 0x473                 ;FZ, PD, PI, SM, DI, DA
    mtsr   pc0, R_TmpPC0              ;restore the PCs
    mtsr   pc1, R_TmpPC1
    iret
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APPENDIX B: 

COMPLETE LISTING OF EXAMPLE PROGRAM 



    .include "regdcl.h"
    .equ     TOP_STK, (0x5000 & ~7)   ;create double word aligned value

    .text
    .global  start
    .reg     tmp1, %%(SYS_TMP + 0)
    .reg     tmp2, %%(SYS_TMP + 1)

start:
    const    rsp, (TOP_STK-8)         ;set stack ptr
    const    rab, (TOP_STK-512)       ;set reg alloc bound
    const    fp, TOP_STK              ;set frame ptr
    const    rfb, TOP_STK             ;set reg free bound

;set correct mode
    mtsrim   cps, 0x72                ;PD, PI, SM, DI
    mtsrim   cfg, 0x10
    mtsrim   vab, 0

;connect up spill handler
    const    tmp1, SpillHandler
    consth   tmp1, SpillHandler
    const    tmp2, V_SPILL
    sll      tmp2, tmp2, 2            ;compute vect addr
    store    0, 0, tmp1, tmp2         ;write spill vector

;connect up fill handler
    const    tmp1, FillHandler
    consth   tmp1, FillHandler
    const    tmp2, V_FILL
    sll      tmp2, tmp2, 2            ;compute vect addr
    store    0, 0, tmp1, tmp2         ;write fill vector

;call main program
    call     raddr, main
    nop

    halt                              ;halt after successful completion



;The routines below handle overflow and underflow conditions.
;The temps which they use are given below.

    .reg   R_Cnt,    %%(SYS_TMP + 0)  ;temp for count (shared)
    .reg   R_TmpPC0, %%(SYS_TMP + 1)  ;temp for PC0
    .reg   R_TmpPC1, %%(SYS_TMP + 2)  ;temp for PC1

    .global SpillHandler
SpillHandler:
;This routine handles a failed assertion in the standard prologue
;In:    rab > rsp         (requiring an allocation)
;       fp <= rfb
;       rfb == rab + 512
;Out:   rab == rsp        (just enough allocated)
;       fp <= rfb
;       rfb == rab + 512

    mfsr   R_TmpPC0, pc0              ;save the PCs
    mfsr   R_TmpPC1, pc1
    mtsrim cps, 0x73                  ;PD, PI, SM, DI, DA
    sub    R_Cnt, rab, rsp            ;R_Cnt = # of bytes to spill
    sub    rfb, rfb, R_Cnt            ;move down the frame bound
    srl    R_Cnt, R_Cnt, 2            ;R_Cnt = count of words to spill
    sub    R_Cnt, R_Cnt, 1            ;correct for storem
    mtsr   cr, R_Cnt                  ;set up count for storem
    storem 0, 0, lr0, rfb             ;spill from the allocated area
    add    rab, rsp, 0                ;move down the allocate bound
    mtsrim cps, 0x473                 ;FZ, PD, PI, SM, DI, DA
    mtsr   pc0, R_TmpPC0              ;restore the PCs
    mtsr   pc1, R_TmpPC1
    iret



        .global FillHandler
FillHandler:
;This routine handles a failed assertion in the standard epilogue
;In:  fp > rfb (requiring de-allocation)
;     rsp >= rab
;     rfb == rab + 512
;Out: fp == rfb (just enough freed)
;     rsp >= rab
;     rfb == rab + 512

        mfsr    R_TmpPC0,pc0            ;save the PCs
        mfsr    R_TmpPC1,pc1
        mtsrim  cps,0x73                ;PD, PI, SM, DI, DA
        const   R_Cnt,512               ;make local reg ip
        or      R_Cnt,R_Cnt,rfb         ;from rfb
        mtsr    ipa,R_Cnt               ;set up indirect ptr for loadm
        sub     R_Cnt,fp,rfb            ;R_Cnt = # of bytes to fill
        add     rab,rab,R_Cnt           ;move up the allocate bound
        srl     R_Cnt,R_Cnt,2           ;R_Cnt = number of words to fill
        sub     R_Cnt,R_Cnt,1           ;correct for loadm
        mtsr    cr,R_Cnt                ;set up count for loadm
        loadm   0,0,gr0,rfb             ;fill area freed
        add     rfb,fp,0                ;move up frame bound
        mtsrim  cps,0x473               ;FZ, PD, PI, SM, DI, DA
        mtsr    pc0,R_TmpPC0            ;restore the PCs
        mtsr    pc1,R_TmpPC1

        iret
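
The pointer arithmetic in the two handlers can be checked with a small model. The Python sketch below is illustrative only: it treats rab, rfb, rsp, and fp as plain integers, does not model the storem/loadm memory transfers beyond the word count, and the function names spill and fill are hypothetical.

```python
WINDOW_BYTES = 512  # rfb == rab + 512 in the handler entry and exit conditions


def spill(rab, rfb, rsp):
    """Model of SpillHandler; returns (rab, rfb, words_transferred)."""
    assert rab > rsp and rfb == rab + WINDOW_BYTES
    count = rab - rsp      # R_Cnt = number of bytes to spill
    rfb -= count           # move down the frame bound
    words = count >> 2     # R_Cnt = count of words to spill
    cr = words - 1         # the count register holds (words - 1)
    rab = rsp              # move down the allocate bound
    return rab, rfb, cr + 1


def fill(rab, rfb, fp):
    """Model of FillHandler; returns (rab, rfb, words_transferred)."""
    assert fp > rfb and rfb == rab + WINDOW_BYTES
    count = fp - rfb       # R_Cnt = number of bytes to fill
    rab += count           # move up the allocate bound
    words = count >> 2     # R_Cnt = number of words to fill
    cr = words - 1         # the count register holds (words - 1)
    rfb = fp               # move up the frame bound
    return rab, rfb, cr + 1
```

In both cases the exit condition rfb == rab + 512 is preserved, because each bound moves by the same byte count.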
INTRODUCTION

The development of a microprocessor-based system is 
a complicated and detailed undertaking that requires
skilled personnel and efficient test equipment. Because 
of the sophistication of modern microprocessing sys- 
tems, they usually cannot be flawlessly designed on the 
first iteration, and nearly always require extensive 
debugging and testing time. Experienced developers 
know that few designs function perfectly at power-up. 
Faults occur due to erroneous logic, poor assembly, or
defective parts, so some debugging is virtually always 
necessary. Therefore, every effort should be made to 
plan the debugging and testing process before the first 
prototype is built. Without advance planning, the 
designer may find that the circuit either cannot be suc- 
cessfully debugged, or that the necessary debug time is 
prohibitive. 

Planners should keep in mind that testing and debug- 
ging continues throughout the life of the product. 
Because different phases in the product life cycle have 
different characteristics, the requirements for each must 
be considered. The major phases of the product life 
cycle are development, production (pilot, limited, and 
large-scale), and field service. 

Apart from the skill of the personnel, the efficiency of test 
equipment is a critical area that affects the testing time 
in every phase. Outdated or ineffective equipment will 
slow down even the most highly trained personnel. More
importantly, expensive, state-of-the-art test equipment 
will be wasted if its use is not preplanned. Careful con- 
sideration must be given to the type of equipment 
needed to service the product, as well as its cost and 
how it will be disbursed to the field. 

AMD offers a comprehensive array of development 
tools that allow development teams to effectively test 
and debug Am29000™-based systems throughout the
life cycle of the product. This document discusses those 
Am29000 development tools, and provides information 
for gauging their usefulness in specific applications 
with respect to cost, capabilities, and target design 
requirements. 

Am29000 DEVELOPMENT TOOLS 

The Am29000 development tools covered in this document
are those used for debugging and testing actual system
hardware. They normally are used with a prototype or
production system to determine the cause of failure, and
are distinguished from the 29K™ tools used to prepare
programs for execution on a target system (see the 29K
Tool Chain section).

Figure 1 shows the relationship of these development 
tools to the application and each other. The components 
are described below: 

ADAPT29K— Advanced Development and Prototyping 
Tool. ADAPT29K™ is a standalone system that inter- 
faces to the application like an in-circuit emulator. It pro- 
vides a wide range of debugging functions without 
intruding on the application's execution. 

MON29K— Target Resident Monitor. MON29K™ is a
monitor program that executes on the target Am29000. 
It provides many of the same debugging functions as the 
ADAPT29K, even though it is a software product. 

XRAY29K— Source-Level Debugger. XRAY29K™ is a
source-level debugging program. It supplies an interac- 
tive, windowed environment for debugging Am29000 
applications using MON29K or ADAPT29K. 

Probe Interface. The Hewlett-Packard® probe interface 
provides an interface between the Am29000 and an HP 
1650 or 16500 logic analyzer. When using a suitable 
logic analyzer, the probe interface allows the tracing of 
Am29000 signals with a 10-ns sample time and disas- 
sembly of Am29000 instructions. 
Figure 1. The Am29000 Development Tools



THE 29K TOOL CHAIN 

The Am29000 development tools discussed in this
document are a subset of the 29K tool chain, a set of
compatible resources provided by AMD for developing
Am29000-based systems. Only the tools used for
debugging are described in this document; other
components of the 29K tool chain are needed to create
the executable object modules that run on an
Am29000-based system.



An object module can be obtained from the set of 
programs shown in Figure 2. Detailed information on 
using the tools to create an executable object module is 
contained in the following documents: 

ASM29K Documentation Set. It provides complete 
information on the installation and use of the ASM29K™
assembler, linker, and librarian manager. It also 
includes documentation on the Am29000 utilities. 

HighC29K Documentation Set. It covers how the 
HighC29K™ C compiler for the Am29000 is used.






29K Family Application Notes 




Figure 2 (flow diagram, summarized):

.C (C source file) or .S (assembly-language source file)
  -> HighC29K Compiler / ASM29K Assembler
  -> .O (relocatable object module)
  -> ASM29K Linker
  -> .OUT (absolute object module)
  -> COFF2HEX -> PROM Programmer
     or
  -> BTOA (Binary to ASCII) -> .ASC (ASCII object module)
     -> ADAPT29K or MON29K Target

11014A-02

Figure 2. The 29K Tool Chain






Introduction to the Am29000 Development Tools 



REFERENCE MATERIALS 

This document covers only information concerning criti- 
cal requirements to consider during development plan- 
ning. Detailed usage of each tool is not covered. 
Additional information can be found in the following 
documents: 

ADAPT29K User's Manual. It provides detailed
information on the ADAPT29K including installation,
commands, theory of operation, and target design
requirements.

MON29K Documentation Set. It provides detailed infor- 
mation on the MON29K including installation, com- 
mands, theory of operation, and target design 
requirements. 

XRAY29K Documentation Set. This set of documents 
includes an installation guide, user's manual, and refer- 
ence guide for XRAY29K, the high-level/assembly- 
language debugger. 

Hewlett-Packard Probe Interface Data Sheet. It gives a
description and electrical specifications for the probe
interface.

These materials can be obtained by writing to: 

Advanced Micro Devices, Inc. 
901 Thompson Place 
P.O. Box 3453 
Sunnyvale, CA 94088-3453

or by calling 1-800-222-9323. 

For questions that cannot be resolved with the current
literature, further technical support can be obtained by 
writing or calling: 

29K Support Products Engineering 

Mail Stop 561 

5900 E. Ben White Blvd. 

Austin, TX 78741 

(800) 2929-AMD (US) 

0-800-89-1131 (UK) 

0-031-11-1129 (Japan) 



HOW TO USE THIS DOCUMENT 

This document discusses the Am29000 development 
environment. However, different readers have different 
requirements and initial levels of knowledge. The layout 
of this document should help readers locate the desired 
information while avoiding redundant or known material. 
In this document, special emphasis is placed on 
answering the questions: 

1. What is the development tool?

2. Where does it fit in the 29K tool chain? 



3. What capabilities does the development tool have? 

4. What requirements must be met to effectively use 
the development tool with the target system? 

The "Summary of the Tools" section summarizes the ad- 
vantages and disadvantages of each development tool. 
Their compatibility requirements also are summarized. 

The "Standalone Execution Board" section details the 
Standalone Execution Board (STEB) manufactured by 
STEP Engineering. The STEB is not actually a develop- 
ment tool, but an example of an Am29000 system that is 
compatible with all the development tools. The section 
highlights important areas of the development environ- 
ment, demonstrating how the STEB was designed to 
comply with the compatibility requirements of the devel- 
opment tools. 

Appendix A contains logic diagrams for the Standalone 
Execution Board. These should be used in conjunction 
with the discussion in the "Standalone Execution Board" 
section to show how the STEB was designed to comply 
with the compatibility requirements of the development 
tools. 



ADAPT29K ADVANCED DEVELOPMENT 
AND PROTOTYPING TOOL 

The ADAPT29K is a standalone unit used for non-intru- 
sive supervision and monitoring of the target circuit, 
much like an in-circuit emulator. Completely self- 
contained, it has its own processor, memory, I/O, and 
power supply. It is connected to the target by a cable 
inserted between the Am29000 and its socket. When 
the target is running, the ADAPT29K monitors bus activ- 
ity. When the target is halted, the ADAPT29K can use 
the target Am29000 to modify memory, provide proces- 
sor status, or perform other debugging functions. 
Figure 3 shows the ADAPT29K. 

Either an ASCII terminal or a host computer can be used 
to control the ADAPT29K. The commands have a 
format similar to the DEBUG program on the IBM® PC. 
When using an engineering workstation (running a 
terminal emulator program), screen logging facilities, 
file storage with uploading and downloading, and batch 
file support are available. Also, XRAY29K (see the 
"XRAY29K Source-Level Debugger" section) can be 
run on a mainframe or workstation, providing source- 
level debugging support. See Figure 4. 
Figure 3. The ADAPT29K

Figure 4. Connections to the ADAPT29K

One major advantage of the ADAPT29K is that, as a
separate unit running on a separate processor from the
target, it can assert hardware control signals to gain
control over the processor, regardless of the state of
the program executing on the target. This allows the
ADAPT29K to be used for debugging a system that
cannot yet run its program. This type of debugging
support is often useful when testing a prototype for the
first time.

FEATURES 

The ADAPT29K has powerful debugging capabilities 
that are important when bringing up a new design. For 
example, it is often necessary to inspect or alter memory 
contents, force test conditions, and patch in code 
sections. By using the ADAPT29K, the developer gains 
these capabilities for supervising the processor execu- 
tion, thus greatly facilitating the initial debugging and 
development of Am29000-based applications. 

Display and Modification of Memory

Using the ADAPT29K, all Am29000 memory spaces 
can be accessed. This includes instruction ROM, 
instruction/data RAM, Am29000 internal registers
(global, local, and special), and coprocessor registers. 
Target data can be displayed or modified. The contents 
of a register or ranges of memory locations can be 
moved or filled; individual bits of special registers may 
be set separately. Table 1 shows the ADAPT29K com- 
mands available for managing memory. 



Table 1. ADAPT29K Memory Display and Modification Commands

Command   Description
D         Display registers/memory
F         Fill registers/memory
I         Input from a port
M         Move memory
O         Output to a port
S         Set registers/memory
X         Display key registers
XC        Display/set coprocessor registers
XP        Display/set protected registers
XT        Display/set TLB registers
XU        Display/set unprotected registers

Memory operations can be performed in byte, half-word,
word, floating-point, or double-precision format. For
example, to display lr4 through lr11 as words, enter:

dw LR4,LR11

Or, to display addresses 10000 to 1001F (hex) in
instruction/data RAM, enter:

db 10000i,1001fi

Figure 5 shows the results of these operations.



# DW LR4,LR11
LR004 61006200 63006400 65006600 67006800 a.b.c.d.e.f.g.h.
LR008 69006a00 6b006c00 6d006e00 6f007000 i.j.k.l.m.n.o.p.

# DB 10000I,1001FI
00010000I 61 00 62 00 63 00 64 00 65 00 66 00 67 00 68 00 a.b.c.d.e.f.g.h.
00010010I 69 00 6a 00 6b 00 6c 00 6d 00 6e 00 6f 00 70 00 i.j.k.l.m.n.o.p.
#

11014A-05

Figure 5. ADAPT29K Memory Displays
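
The layout of the byte display can be illustrated with a short Python sketch. It is not ADAPT29K code: the function name format_db_line is hypothetical, and the exact column spacing of the real display is an assumption based on the figure above (address, space suffix, sixteen hex bytes, then printable characters with "." for everything else).

```python
def format_db_line(address, data, space="I"):
    """Format one line of a byte dump in the style of the DB display."""
    hex_part = " ".join(f"{b:02x}" for b in data)
    # Printable ASCII is shown as-is; all other bytes appear as "."
    ascii_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)
    return f"{address:08X}{space} {hex_part} {ascii_part}"


# Sixteen bytes matching the first DB line of the figure: 'a', 0, 'b', 0, ...
sample = bytes(x for c in "abcdefgh" for x in (ord(c), 0))
line = format_db_line(0x10000, sample)
```

Here line holds the text of the first DB row, beginning with 00010000I and ending with a.b.c.d.e.f.g.h.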
Several ADAPT29K commands make it easier to display
common memory groups. Frequently, when
debugging specific areas of an application, the same 
data areas will need to be displayed repeatedly. For 
example, when testing a TLB-miss trap handler, it may 
be necessary to stop program execution after reloading 
the TLB to determine if the proper entry has been 
updated. The TLB entries can be displayed easily using 
the XT command, as shown in Figure 6. 

Likewise, the processor status information contained in 
the special protected registers can be displayed using 
the XP command, as shown in Figure 7. 



Often, the best time to examine memory locations is
immediately after program execution has halted. A 
substantial amount of repetitive key entry can be elimi- 
nated by using the E command, which defines a 
command list that executes whenever the Am29000 
halts. For example, to automatically perform the same 
operations shown in Figure 5 every time the Am29000 
halts, the execution list could be defined as: 

E DW LR4,LR11;DB 10000I,1001FI

The next time execution halts, the local registers lr4
through lr11 will be displayed, followed by a display of
memory locations 00010000 through 0001001F, just as
would have occurred if the commands had been
entered individually.
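
The behavior of the execution command list can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the ADAPT29K implementation: the class and method names are hypothetical, and only the semicolon separator is taken from the example above.

```python
class CommandListModel:
    """Toy model of a command list that is replayed on every halt."""

    def __init__(self):
        self.on_halt = []   # commands defined with the E command
        self.log = []       # commands "executed", in order

    def define(self, text):
        # Commands are separated by semicolons, as in the E example.
        self.on_halt = [c.strip() for c in text.split(";") if c.strip()]

    def halt(self):
        # Each halt replays the whole list, as if typed individually.
        for command in self.on_halt:
            self.log.append(command)


model = CommandListModel()
model.define("DW LR4,LR11;DB 10000I,1001FI")
model.halt()
model.halt()
```

After two halts, the log contains the DW and DB commands twice, in the order they were defined.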
Figure 6. TLB Entries Display











# xp 

CA IP TE TP TU FZ LK RE WM PD PI SM IM DI DA 
CPS: 000000000000000 
OPS: 000000000000000 

VAB CFG: PRL VF RV BO CP CD 
0000 00 

CHA CHD CHC: CE CNTL CR LS ML ST LA TF TR NN CV 
00000000 00000000 00 00 00 

RBP: BF BE BD BC BB BA B9 B7 B6 B5 B4 B3 B2 B1 B0
000000000000000 

TCV TR: OV IN IE TRV PC0 PC1 PC2 MMU: PS PID LRU
000000 000000 00000000 00000000 00000000 00 

# 

Figure 7. Protected Register Display 
Execution Control

The ADAPT29K can completely control target execu- 
tion. Processing may be at full speed, or the target may 
be single-stepped, or it can be run until a breakpoint is 
encountered. Table 2 shows the ADAPT29K com- 
mands that control program execution. 

Table 2. ADAPT29K Execution Control Commands

Command   Description
B         Breakpoint display, set, and reset
C         Check execution state
E         End execution command list
G         Go (start program execution)
K         Kill program execution
T         Trace (single step) instructions



Two types of breakpoints are available: "non-sticky" and 
"sticky." Non-sticky breakpoints are temporary break- 
points set as optional parameters of the G (Start 
Program Execution) command. They are reset when 
program execution stops. Fixed, or "sticky," breakpoints 
are set by using the B command. They remain in effect 
until they are expressly removed. 
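
The difference in lifetime between the two breakpoint types can be sketched as follows. The Python model is illustrative only; the class and method names are hypothetical and do not correspond to ADAPT29K internals.

```python
class BreakpointTable:
    """Toy model of sticky (B command) and non-sticky (G command) breakpoints."""

    def __init__(self):
        self.sticky = set()
        self.temporary = set()

    def b_set(self, address):
        self.sticky.add(address)          # stays until expressly removed

    def b_remove(self, address):
        self.sticky.discard(address)

    def go(self, *temporary_addresses):
        self.temporary = set(temporary_addresses)

    def active(self):
        return self.sticky | self.temporary

    def stopped(self):
        self.temporary.clear()            # non-sticky breakpoints are reset
```

A breakpoint passed to go() disappears after the first stop, while one set with b_set() survives until b_remove() is called.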

Debugging Support 

Because the ADAPT29K was designed to aid debug- 
ging, it has several unique features that aid in testing the 
target. The testing aids include running memory 
diagnostics, assertion of repetitive signals, pulsing inter- 
face lines, and forced execution of Am29000 instruc- 
tions. The commands are shown in Table 3. 



Table 3. ADAPT29K Debugging Commands

Command   Description
A         Assemble in memory
J         Jam an instruction
L         List memory
P         Pulse the reset line
W         Run interface diagnostics
Z         Display trace buffer



The ADAPT29K's J command forces the processor to
execute a user-specified Am29000 instruction. Issuing
the P command pulses the processor reset line,
initiating a hardware restart. Options of the W command
specify various diagnostics to be executed, including a
target memory test over a specified range of addresses;
it also can be used to generate repetitive read and write
signals for easy triggering of an oscilloscope.

Bus Tracing 

A real-time bus trace facility is supported. Whenever the
target Am29000 is executing, the ADAPT29K traces
most CPU pins and stores their states in a 4096-entry
ring buffer. All Am29000 signals are traced, except
INCLK, SYSCLK, CNTL0, CNTL1, *TEST, and
*RESET.

The state of the traced signals at each bus cycle is
numbered sequentially and stored as an entry in the
trace buffer. It may later be displayed to the terminal
or host using the Z command shown in Table 3. A range
of entries may be displayed in any of three formats.
One (Figure 8) shows the disassembled instructions.
Another (Figure 9) shows the states of the traced
control signals. The third combines both displays.
Figure 8. Bus Trace Display

Figure 9. Control Signal Trace Display
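
A ring buffer of fixed depth keeps only the most recent entries, which is why the trace always reflects the last bus cycles before a halt. The Python sketch below illustrates the idea; the class name, the entry layout, and the numbering scheme (a running count from the start of tracing) are assumptions, not a description of the ADAPT29K hardware.

```python
from collections import deque

TRACE_DEPTH = 4096  # number of entries stated for the ADAPT29K trace buffer


class TraceBuffer:
    """Toy ring buffer: once full, each new entry overwrites the oldest."""

    def __init__(self, depth=TRACE_DEPTH):
        self._entries = deque(maxlen=depth)
        self._next_number = 0

    def record(self, signals):
        # Entries are numbered sequentially as they are captured.
        self._entries.append((self._next_number, signals))
        self._next_number += 1

    def entries(self, first=None, last=None):
        """Return the retained entries whose numbers fall in [first, last]."""
        return [(n, s) for n, s in self._entries
                if (first is None or n >= first)
                and (last is None or n <= last)]
```

After 5000 recorded cycles, for instance, only entries 904 through 4999 remain available for display.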













Assembly/Disassembly 

The ADAPT29K has a built-in, in-line assembler/disas- 
sembler that allows instruction memory examination 
and alteration using Am29000 mnemonics rather than 
hex values. The syntax corresponds to the ASM29K 
macro-assembler. 

Serial Ports 

The ADAPT29K has two serial ports. One is a data com- 
munications equipment (DCE) port; the other is a data 
terminal equipment port (DTE). Both ports conform to 
EIA convention RS232. Generally, a user gives com- 
mands to the ADAPT29K from an ASCII terminal or 
engineering workstation connected to the DCE port. 
(See Table 4 for a list of the commands.) A source-level 
debugger, such as XRAY29K (see the "XRAY29K 
Source-Level Debugger" section) running on a remote 
host, would use the DTE port. 

Either port may be used to upload or download pro- 
grams to the target. In this way, a user can control the 
ADAPT29K from an ASCII terminal while downloading 
programs from a remote host connected to the DTE 
port. Both Tektronix® Hex and Motorola® S3 formats 
are accepted. The ports can be connected together, 
enabling the terminal device to communicate with the 
remote host. 
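
For reference, a Motorola S3 record carries a byte count, a 32-bit address, the data bytes, and a one's-complement checksum, all as ASCII hex. The Python sketch below builds one such record; the function name is hypothetical, and any record-length limits or terminator records expected by the ADAPT29K loader are not covered here.

```python
def s3_record(address, data):
    """Build one Motorola S3 record (32-bit address) as an ASCII string."""
    payload = address.to_bytes(4, "big") + bytes(data)
    count = len(payload) + 1  # address bytes + data bytes + checksum byte
    # Checksum: one's complement of the low byte of the sum of the
    # count, address, and data bytes.
    checksum = 0xFF - ((count + sum(payload)) & 0xFF)
    return "S3" + f"{count:02X}" + payload.hex().upper() + f"{checksum:02X}"


record = s3_record(0x00010000, bytes([0x61, 0x00, 0x62, 0x00]))
```

For this input the record is S309000100006100620032: count 09, address 00010000, data 61006200, checksum 32.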

Table 4. ADAPT29K Serial Port Commands

Command   Description
N         Change the "normal character"
          (used to connect DCE and DTE ports)
R         Enter remote mode
V         Save memory to a file
Y         Load a file to memory



On-Line Help

On-line help is available for all commands. A command 
summary can be obtained by entering: 

H <CR> 

Specific help on an individual command can be obtained 
by entering H followed by the letter of the command. All 
command explanations show the complete command 
syntax and give a short description of how the command 
functions. 

HOW THE ADAPT29K WORKS 

The ADAPT29K runs on a different processor than the 
target. It performs all operations on the target by control- 
ling the target Am29000. A buffered cable connects the 
ADAPT29K to the target's Am29000 socket. Figure 10 
shows the signals carried on the cable. Note that 
although the ADAPT29K traces the address bus, it can- 
not drive it, and, consequently, cannot provide an over- 
lay memory. It uses the target Am29000 to set up all 
memory addresses before it can access them. 

Execution Control 

The execution state of the target Am29000 is controlled 
by using the CNTL0 and CNTL1 signals. By asserting
different combinations of the two signals, the Am29000 
can be placed in one of four states: RUN, HALT, STEP, 
and LOAD TEST INSTRUCTION. How these states 
affect the processor is explained in detail in the 
Am29000 User's Manual, order #10620. 






The LOAD TEST INSTRUCTION state should be noted
due to its importance to the ADAPT29K. Because this
state interrupts normal sequential processing and
permits a sequence of instructions to be loaded into
the processor's instruction stream, the ADAPT29K can
use it to force the processor to perform operations on
the target.



Memory Access 

Due to the high speed of the Am29000, the ADAPT29K, 
unlike some in-circuit emulators, does not provide any 
overlay memory. To maintain real access times, the
processor must be kept as physically close to its mem- 
ory as possible. There is no time available for the propa- 
gation delay that would be experienced in accessing 
memory across the interface cable to the ADAPT29K. 
Figure 10. The ADAPT29K-to-Target Interface

All target code and data is stored on the target. When
the ADAPT29K is commanded to display a data object, 
it places the target Am29000 in the LOAD TEST 
INSTRUCTION state. Then a sequence of instructions 
is inserted to store the present Am29000 state, set up a 
new memory address, load the data into an Am29000 
register, store the data to the ADAPT29K, and restore 
the Am29000 state. 

This method imposes certain requirements. Because
data is transferred between the ADAPT29K and the
target over the data bus, the target memory must be
protected from corruption. To prevent inadvertent
changes to the target memory, it must be disabled from
responding while the ADAPT29K and the target
processor are transferring data. There are two ways of
doing this: (1) the memory can be disabled by a low
state on the PIN169 alignment pin (pin D4), or (2) the
target memory can be disabled when 06 hex is decoded
on the OPT2-OPT0 pins.
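
The two disable methods amount to a simple enable condition in the target's memory decode logic. The Python sketch below models that condition only; the function and parameter names are hypothetical, and a real design would implement this in hardware.

```python
ADAPT_OPT_CODE = 0x6  # OPT2-OPT0 value decoded during ADAPT29K transfers


def target_memory_enabled(opt, pin169_high=True, use_alignment_pin=False):
    """Return True if target memory may respond to the current access.

    use_alignment_pin selects method (1), where a low level on the PIN169
    alignment pin disables memory; otherwise method (2) is modeled, where
    memory is disabled whenever OPT2-OPT0 decode to 06 hex.
    """
    if use_alignment_pin:
        return pin169_high
    return (opt & 0x7) != ADAPT_OPT_CODE
```

With method (2), an access with OPT2-OPT0 equal to 6 is ignored by target memory and every other option code is serviced normally.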

When the contents of instruction ROM must be 
displayed, the ADAPT29K must instruct the processor 
to read instruction ROM as data. Hence, a hardware 
path must exist for data stored in the instruction ROM
space (on the instruction bus) to be loaded into an 
Am29000 register from the data bus. 

Similarly, when the ADAPT29K is used to download a 
program, the code will be written word-by-word to the 
target Am29000, which then writes the instructions into 
proper memory space. Suppose, for example, code is to 
be written into the instruction/data RAM. Because the 
ADAPT29K has no means for virtual translation of 
addresses, it will use Store instructions to write the code 
into the absolute address in the instruction/data space. 
When the Am29000 goes to execute the code, it will
expect to fetch its instructions over the instruction bus.

This requires that there be a hardware path from the 
data bus to the instruction bus and a one-to-one corre- 
spondence between addresses on the data bus and the 
addresses on the instruction bus. This occurs because 
the instruction is stored at an address on the data bus, 
but is fetched via the instruction bus. In other words, in- 
structions fetched from an address in the instruction 
RAM space via the instruction bus must produce the 
exact information as would be retrieved from the same 
address in the data RAM space via the data bus. 
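
The one-to-one requirement can be stated as a check: for every address, a fetch over the instruction bus and a read over the data bus must return the same word. The Python sketch below expresses that check against two hypothetical access functions; the small memory image and the faulty "shifted" decode are invented purely for illustration.

```python
def find_mismatches(fetch_instruction, read_data, addresses):
    """Return the addresses where the two bus views disagree."""
    return [a for a in addresses if fetch_instruction(a) != read_data(a)]


# Hypothetical target: one RAM array visible on both buses at the same addresses.
ram = {address: 0x1000 + address for address in range(0, 32, 4)}
same_view = find_mismatches(ram.get, ram.get, ram)

# Hypothetical faulty decode: the instruction bus sees the RAM shifted by one word.
shifted_view = find_mismatches(lambda a: ram.get(a + 4), ram.get, ram)
```

A compliant target yields an empty list; the shifted decode reports every address, so downloaded code would not execute from where it was stored.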

Breakpoints 

Because the Am29000 is one of the fastest commercial
processors available, there is no practical way to read
each address on the address bus and compare it
against a breakpoint table to determine if a break should
occur, as is done in an in-circuit emulator. Instead, the
ADAPT29K swaps a halt instruction into memory at the
location of the breakpoint. When the executing
processor encounters the breakpoint, it halts. The
ADAPT29K, upon detecting the halt, compares the halt
address with the breakpoint table. If there is a match,
it swaps the original instruction back into memory and
informs the operator that a breakpoint has occurred.

This method of setting breakpoints also contributes to 
the requirement for a one-to-one translation of ad- 
dresses between the data bus and the instruction bus. 
For example, when the ADAPT29K sets a breakpoint in 
the instruction ROM space, it does so by using the target 
Am29000 to read the original instruction, then writes the 
halt into the address location. This is performed as a 
data movement operation, using the bi-directional path 
to the instruction bus discussed in the Memory Access 
section. For the breakpoint to be effective, the executing 
program must encounter the breakpoint at the same ad- 
dress at which it was stored. 
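
The swap-in, swap-out mechanism described in this section can be sketched in Python. This is a simplified model, not ADAPT29K code: memory is a dictionary of address to instruction text, the names are hypothetical, and the breakpoint is dropped from the table once it is hit.

```python
HALT = "halt"


class BreakpointModel:
    """Toy model of breakpoints implemented by swapping in a halt."""

    def __init__(self, memory):
        self.memory = memory   # address -> instruction
        self.saved = {}        # breakpoint table: address -> original

    def set_breakpoint(self, address):
        # Read the original instruction, then write the halt in its place.
        self.saved[address] = self.memory[address]
        self.memory[address] = HALT

    def on_halt(self, address):
        """Return True if the halt at this address was a breakpoint."""
        if address not in self.saved:
            return False       # the halt belongs to the program itself
        # Swap the original instruction back into memory.
        self.memory[address] = self.saved.pop(address)
        return True
```

The model also shows why the halt must be fetched from the same address at which it was written: the table lookup in on_halt is keyed on that address.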

TARGET DESIGN REQUIREMENTS 

Throughout the preceding discussion, it should be clear 
that the ADAPT29K only interfaces to the target via the 
target Am29000, and uses only the target memory for 
storage of the application program. This places certain 
hardware requirements on the application. These are 
listed below. For a specific example, see the Standalone
Execution Board section. 

1. The physical device in the instruction ROM space
must be a RAM device if code is to be downloaded
to the instruction ROM space, or if breakpoints will
be set in the instruction ROM space.

2. A bi-directional path must exist between the instruc- 
tion and data buses. 

3. There must be a one-to-one translation between 
instruction bus addresses and data bus addresses. 

4. The ADAPT29K must be able to disable the target
memory using a low signal on the PIN169 alignment
pin (D4), or when OPT2-OPT0 are 06 hex.

5. Physical clearance must be provided for the con- 
nection of the interface cable at its proper orienta- 
tion. 

6. Signals driven by the ADAPT29K (see Table 5) 
must be open-collector or tri-state. 
Table 5. Am29000 Signals Driven by the ADAPT29K

Pin             Configuration
Alignment pin   (Input with pull-up resistor) [Notes 1, 2]
D31-D0          (Tri-state)
I31-I0          (Tri-state)
DERR            (Input with pull-up resistor) [Note 1]
RESET           (Open collector, pull-up with 1 K ohm resistor)
DRDY            (Pull-up resistor) [Note 1]
STAT1-STAT0     (Input)
TEST            (Open collector) [Note 3]

Notes:
1. Pull-up resistors should be 330 to 1000 ohms.
2. This is an optional configuration. It is used if memory will be
   disabled by the alignment pin (PIN169).
3. Note that TEST is active longer than RESET. Since all outputs
   will be in a high-impedance state, it may be prudent to pull up all
   Am29000 outputs to avoid ambiguous inputs (to other devices).



MON29K TARGET RESIDENT 
MONITOR 

MON29K is a target-resident monitor that has
functionality similar to the ADAPT29K monitor. MON29K
provides many important debugging capabilities,
including memory display and alteration, code uploading
and downloading, and assembly and disassembly. However,
unlike the ADAPT29K, MON29K is an entirely software
product. It resides completely in the target memory
and executes on the target Am29000 (see Figure 11).

MON29K has I/O driver routines to handle two serial
ports. Either port can be used to receive commands, 
although the hardware must be supplied by the target. 
With the proper hardware, MON29K can receive com- 
mands from an ASCII terminal or a remote host. It also 
can act as the interface between XRAY29K and the 
target. MON29K is supplied in C source code form so 
the I/O drivers and service routines can be modified to fit 
the particular hardware environment. 

Since it is entirely software, MON29K can be perma- 
nently embedded in the product. It takes only 256K of 
address space in instruction ROM; thus, it can remain 
with the application and be used to diagnose problems 
at all stages of the product life cycle, from development 
to field support. 
Figure 11. MON29K System Connections

FEATURES

MON29K provides powerful testing capabilities. Many 
of MON29K's features are, in fact, the same as the 
ADAPT29K. These include: 

• Display and alteration of memory, I/O ports, and
registers. Using MON29K, target data can be
displayed, set, or altered. All Am29000 memory
spaces may be accessed, including: Am29000 internal
registers (global, local, and special), coprocessor
registers, instruction/data RAM, or instruction ROM.

• In-line assembly and disassembly. MON29K comes 
with a built-in, in-line assembler/disassembler. 
Am29000 instruction mnemonics can be converted to 
machine codes and stored at a specified location, or 
ranges of addresses may be disassembled and 
displayed in mnemonic form. 

• Uploading and downloading of programs. MON29K 
can use two serial ports, assuming they are provided 
by the target hardware. One port is a data communi- 
cations equipment (DCE) port; the other is a data 
terminal equipment port (DTE). Files may be 
uploaded or downloaded in Motorola or Tektronix 
formats. Also, XRAY29K can communicate with 
MON29K through one of the ports. 

• Execution Control. MON29K can control target exe- 
cution. It can initiate full-speed execution, or single- 
step the processor. 

• Set/Reset Breakpoints. Both permanent and tempo- 
rary breakpoints are supported. 

• On-line help. On-line help that shows the complete 
syntax is available for all commands. 

MON29K Commands 

Many of the MON29K commands (and consequently 
the features) are identical to those of the ADAPT29K. 
The MON29K commands, all of which are implemented 
in ADAPT29K, are listed in Table 6. 



Table 6. MON29K Commands

Command   Description
A         Assemble in memory
B         Breakpoint display, set, and reset
C         Check execution state
D         Display registers/memory
E         End execution command list
F         Fill registers/memory
G         Go (start program execution)
I         Input from a port
L         List memory
M         Move memory
N         Change the "normal character"
O         Output to a port
R         Enter remote mode
S         Set registers/memory
T         Trace (single-step) instructions
V         Save memory to a file
X         Display key registers
XC        Display/set co-processor registers
XP        Display/set protected registers
XT        Display/set TLB registers
XU        Display/set unprotected registers
Y         Load a file to memory



Differences Between MON29K and ADAPT29K 

Because MON29K runs on the target processor, not as
a separate unit, it has limitations that the ADAPT29K
does not have. In particular, MON29K has no K (Kill), J
(Jam), Z (Trace), or W (interface diagnostics)
commands.

MON29K is not able to assert a kill command because
when the application is running, the application controls
the processor. Clearly, when MON29K is not in control
of the processor, it has no means of evaluating serial
input and taking action. This could be overcome if
MON29K polled the serial I/O device, but such
continuous polling would hinder real-time execution.
Instead, to allow programs to be forcefully terminated,
MON29K can be configured to respond to
interrupt-driven serial I/O. When MON29K is initialized
to respond to interrupt-driven serial I/O, it intercepts a
CTRL-C and passes control to a handler that recovers
the processor to MON29K. This technique is effective in
most cases, except if the application program has
reached a HALT instruction. Then, the system must be
reset. Usage of interrupt-driven serial I/O is determined
as an option of the command (not present on the
ADAPT29K).
TARGET DESIGN REQUIREMENTS

MON29K does place some requirements on the target 
design. They are listed below. For a sample implemen- 
tation of the compatibility requirements, see the Stand- 
alone Execution Board section. 

1. The physical part in the instruction ROM space 
must be a RAM device if the code will be down- 
loaded to the instruction ROM space, or if break- 
points will be set in the instruction ROM space. 

2. The Am29000 cannot write on the instruction bus, 
so a bi-directional path must exist between instruc- 
tion and data buses. 

3. Instruction bus addresses must produce the same 
data as data bus addresses. 

4. As a target-resident monitor, MON29K does take up
some of the target memory; thus, sufficient memory
must be provided for MON29K. An application using
MON29K must have 256 Kbytes of memory in the
instruction ROM space for the program, and a
64-Kbyte workspace in instruction/data RAM. Both
spaces must begin at address (Or and Od).

5. If program control must be recovered from the appli- 
cation before it ends or returns control normally, 
accommodations must be made to use interrupt-driven
serial I/O. When interrupt-driven serial I/O is
used, a MON29K interrupt routine will handle a 
CTRL-C by terminating the application program and 
returning control to MON29K. 

6. MON29K expects the serial I/O driver to be an 8530 
serial communications controller. Using a different 
I/O driver will require modifications to be made to 
MON29K. 

7. AMD cannot anticipate every possible scenario in 
which the Am29000 will be introduced, and it is 
possible that MON29K will require some modifications
to the I/O drivers and service routines before it
can run on the target. Although binary code is avail- 
able from AMD, MON29K is supplied in source code 
form. Of course, any changes will have to be com- 
piled using a C compiler that produces object mod- 
ules for the Am29000. 



XRAY29K SOURCE-LEVEL 
DEBUGGER 

XRAY29K, the high-level/assembly-level debugger, is a
program that provides an interactive, windowed
environment for debugging Am29000-based systems.
Using XRAY29K, program statements may be read in
source language, and data objects may be modified and
changed by referencing symbol names. Thus, target
operations can be performed using source-level
constructs, rather than machine codes and numeric
addresses. To further clarify the target environment,
XRAY29K's multi-window interface simultaneously
displays user-selected program information.

Commands are issued to XRAY29K using a compre- 
hensive debugger command language. The language 
supports a wide range of functions, including setting 
breakpoints, single-stepping, and examining or altering 
any C- or assembly-language variables. The language 
syntax is very similar to C, and also supports debugging 
commands, creation of symbols during a debugging 
session, and convenient specification of address 
ranges. 

XRAY29K resides on a host system and communicates 
with the target system through either the ADAPT29K or 
MON29K. Frequently, the host system is an engineering 
workstation attached to the ADAPT29K, as shown in 
Figure 12. In that system, XRAY29K provides a comfortable
user-interface, while operations are asserted on
the target by the ADAPT29K. Alternately, XRAY29K 
could reside on a mainframe and communicate with a 
target running MON29K. The user interface could then 
be done via an ASCII terminal. 

FEATURES 

XRAY29K supports source-level debugging in either of 
two modes: high-level or assembly-level. In high-level 
mode, an application can be debugged using C- 
language expressions and statements. In this way, C 
variables and expressions replace numeric addresses 
for memory access, and the code can be viewed by line 
number or procedure name. 

In assembly-level mode, an application can be 
debugged using assembly-language statements. The 
assembly-level mode additionally allows machine-level 
register and status bit manipulation. 

Commands are given to XRAY29K using its powerful 
debugger language, thus gaining access to XRAY29K's 
full range of debugging services. The services include: 

• Setting and examination of memory and register 
contents using the declared format and the variable 
name. 

• Simple and complex breakpoints that can be set and 
removed in either C-language or assembly-language 
source code. 

• Single-step and full-speed program execution. 

• Assembly and disassembly of object code. 

• Simulated I/O and interrupts.

• Execution time measurement. 
Figure 12. XRAY29K System Connections



The commands for manipulating memory and registers 
are shown in Table 7. 

Table 7. XRAY29K Memory and Register Commands

Command    Description

compare    Compare two blocks of memory
copy       Copy a memory block
fill       Fill a memory block with values
search     Search a memory block for a value
setmem     Change the values of memory locations
setreg     Change a register's contents
test       Examine memory area for invalid values

Commands for controlling program execution are listed
in Table 8; other display commands are listed in Table 9.

Table 8. XRAY29K Breakpoint and Execution Commands

Command            Description

breakinstruction   Set an instruction breakpoint
clear              Clear a breakpoint
go                 Start or continue program execution
gostep             Execute macro after each instruction step
step               Execute a number of instructions or lines
stepnocall         Step, but execute through procedures

Table 9. XRAY29K Display Commands

Command       Description

disassemble   Display disassembled memory
dump          Display memory contents
expand        Display a procedure's local variables
find          Search for a string
fopen         Open a file or device for writing
fprintf       Print formatted output to a viewport
list          Display C source code
monitor       Monitor variables
next          Find string's next occurrence
nomonitor     Discontinue monitoring variables
printf        Print formatted output to command viewport
printvalue    Print a variable's value

Windowed Information Display 

XRAY29K shows all critical program information at once 
in multi-windowed displays. The contents of the run- 
time stack, the selected general-purpose registers, the 
current source lines being executed, or virtually any 
other program information, can be checked at a glance, 
without the need to constantly request each piece of 
information individually. 

Information is grouped into screens, which are com- 
posed of one or more windows of specific data called 
viewports. There are three predefined screens: high- 
level, assembly-level, and standard I/O. Distributed 
among these screens are the 17 pre-defined viewports 
listed in Table 10. 
Figure 13 shows the high-level mode screen display. It
has four viewports: data, trace, code, and command. 
This screen is displayed when an object module gener- 
ated by a C source program is executed. 



Figure 14 shows the assembly-level mode screen 
display. It has five viewports: data, stack, disassembled 
code, registers (Am29000), and command. This screen 
is displayed when an object module generated by an 
assembly-language program is executed. 



Table 10. XRAY29K Predefined Viewports

Viewport         Description

Command(2)       Debugger commands are submitted to XRAY29K from this viewport.
                 There is a command viewport for both high-level and
                 assembly-level modes.

Code(2)          Displays source code in high-level mode or disassembled
                 instructions in assembly-level mode.

Data(2)          Displays monitored variable expressions in high-level and
                 assembly-level mode.

Trace            Shows the procedure calling chain (high-level mode only).

Stack            Shows stack contents beginning from the stack pointer
                 (assembly-level mode only).

Register         Displays current values of Am29000 registers
                 (assembly-level mode only).

Status Line(2)   Used for debugger command information such as CPU type,
                 current module name, and current operation. This viewport
                 is present in both high-level and assembly-level modes.

Standard I/O     Shows interactive information being received from the
                 std.in or sent to the std.out.

Break            Shows breakpoint information such as number, address,
                 module name. Temporarily overlays top of screen when
                 breakpoint is encountered.

Error            Appears when an error occurs to indicate type and source
                 of error.

Help             Shows on-line help information when requested.

Log              Displays logged keystrokes.

Journal          Shows all previous commands and their results.
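
As a quick cross-check of the count of 17 predefined viewports, the entries of Table 10 can be tallied. The short Python sketch below assumes that "(2)" after a viewport name means two instances, one for each debugging mode, which is how the Command and Status Line entries describe themselves; the variable names are illustrative only.

```python
# (viewport name, number of instances) as listed in Table 10.
# Assumption: "(2)" marks a viewport that exists once per debugging mode.
PREDEFINED_VIEWPORTS = [
    ("Command", 2), ("Code", 2), ("Data", 2), ("Trace", 1), ("Stack", 1),
    ("Register", 1), ("Status Line", 2), ("Standard I/O", 1), ("Break", 1),
    ("Error", 1), ("Help", 1), ("Log", 1), ("Journal", 1),
]

total_viewports = sum(count for _, count in PREDEFINED_VIEWPORTS)
print(total_viewports)
```

Thirteen table entries, four of them doubled, give the total stated in the text.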



Figure 13. The Standard High-Level-Mode Screen
(Screen image: data, trace, code, and command viewports. The code
viewport lists the opening lines of sievex.c, a scaled-down
Eratosthenes sieve prime-number program; the status line shows
module CRT0_S; the command viewport shows the note "in startup
routine. Press F9 to go to main.")
Figure 14. The Standard Assembly-Level Mode Screen
(Screen image: data, stack, code, registers, and command viewports.
The stack viewport lists local registers LR0 through LR5; the code
viewport shows disassembled instructions beginning at address
00010004; the registers viewport shows Am29000 special and general
registers; the command viewport reports "auto halt at address
0x00010004" and the note "in startup routine. Press F9 to go to
main.")
The standard I/O screen has one regular viewport: the
standard I/O viewport, although the breakpoint, error, 
and help viewports also will appear. The standard I/O 
screen is used when interactive input is requested from 
the standard input device, or when output is directed to 
the standard output device. 

The viewport commands, shown in Table 11, control the
way information is displayed on the screen. By using the 
viewport commands, a viewport's size, color, and cursor 
position can be changed. Viewports can be added or 
deleted, and custom screens and viewports can be 
defined. 



Table 11. XRAY29K Viewport Commands

Command   Description

vactive   Activate a viewport
vclear    Clear data from a viewport
vclose    Remove user-defined viewport or screen
vcolor    Select viewport colors
vmacro    Attach a macro to a viewport
vopen     Create a screen or viewport or change size
vsetc     Set a viewport's cursor position
zoom      Increase or decrease a viewport's size



Utility Functions 

In addition to its powerful features for execution control 
and display of system information, XRAY29K provides 
several utility features. These features ease debugging 
by streamlining the routine operations. The services 
include command keys, macros, and command files. 

Command Keys 

The most frequently used XRAY29K functions have
been assigned to a key combination referred to as a
"command key." By using command keys, common
debugger commands can be entered with the minimum
number of keystrokes, often only one key or a CTRL-key
combination.

Macros 

XRAY29K has a powerful, multifaceted macro facility. 
Because a macro may contain complex user command 
procedures, which are executed by entering the macro 
name on the command line, the facility can be used for 
several purposes. Table 12 shows the debugging 
language's macro-related commands. 

Table 12. Macro Commands 

Command Description 

define Create a macro 

show Display the macro source 



Macros can be invoked when a breakpoint is encountered.
Powerful conditional and looping statements in
the command language allow the macro to evaluate
program or register variables, and alter program flow
depending on their condition. Hence, macros can be
used to establish very complex breakpoints that take
specific action, depending on their environment.
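
The idea of a breakpoint whose macro decides what happens next can be sketched outside the debugger. The following Python model is a stand-in only: it does not use XRAY29K macro syntax, and the `Breakpoint` class, the `stop_when_counter_large` function, and the register values are all hypothetical. It assumes a macro that reports whether execution should stop or resume.

```python
class Breakpoint:
    """Hypothetical model of a breakpoint with an optional attached macro."""

    def __init__(self, address, macro=None):
        self.address = address
        self.macro = macro  # callable taking a register dict, or None

    def should_stop(self, registers):
        """Return True to stop at the breakpoint, False to resume."""
        if self.macro is None:
            return True  # a simple breakpoint always stops
        return bool(self.macro(registers))


def stop_when_counter_large(registers):
    # Illustrative condition: stop only when gr96 exceeds 0x200.
    return registers.get("gr96", 0) > 0x200


complex_bp = Breakpoint(0x00010004, macro=stop_when_counter_large)
print(complex_bp.should_stop({"gr96": 0x210}))
print(complex_bp.should_stop({"gr96": 0x010}))
```

The same breakpoint address therefore behaves differently depending on the program state the macro inspects, which is the sense in which the text calls such breakpoints complex.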

Macros also can be attached to user-defined viewports.
When the associated window is opened, the macro will
execute. This type of macro can write specific data into
the window, which is useful for monitoring environmental
information.

Command/Batch Files 

XRAY29K can process command files. A command file 
contains one or more debugger commands that can be 
processed by XRAY29K automatically, without the need 
for user interaction. This is also called batch-mode 
operation. Command files can be used to recreate a 
debugging session, easily implement automated test 
procedures, and eliminate reentering of frequently used 
command sequences. 

Other XRAY29K Utility Functions 

XRAY29K possesses several other utility functions. 
These include services for manipulating symbols, 
evaluating expressions, setting display and recording 
modes, and controlling the session. Table 13 lists 
the symbol commands. Table 14 lists the miscella- 
neous utility commands, and Table 15 lists the session 
commands. 



Table 13. Symbol Commands

Command        Description

add            Create a symbol
delete         Delete a symbol from the symbol table
printsymbols   Display symbol, type, and address
scope          Specify current module and procedure scope

Table 14. Miscellaneous Utility Commands

Command       Description

cexpression   Calculate an expression's value
error         Set include file error handling
help          Display on-line help screen
include       Read in and process a command file
log           Record debugger commands and errors in a file
mode          Select debugger mode (high-level or assembly-level)
option        Set debugger options for this session
pause         Pause simulation
reset         Simulate processor reset
restart       Reset the program starting address
startup       Save the default start-up options

Table 15. Session Commands

Command   Description

host      Enter the host operating system environment
load      Load an object module for debugging
quit      End a debugging session


TARGET DESIGN REQUIREMENTS 

XRAY29K itself places no restrictions on the target 
hardware design. However, being strictly a software 
product, XRAY29K needs a hardware connection to 
the target. For debugging Am29000-based systems, 
XRAY29K must be used in conjunction with either 
ADAPT29K or MON29K; the target design require- 
ments for those tools apply. 
XRAY29K requires a host system. Versions of
XRAY29K currently exist for UNIX and DOS environ- 
ments. 

XRAY29K works only with object files that have been 
compiled in such a way that they contain debugger infor- 
mation regarding line numbers, etc. Thus, to use 
XRAY29K, either the ASM29K macro-assembler or 
HighC29K cross-compiler must be used, as well as the 
ASM29K linker. These are explained in the "29K Tool 
Chain" section. 

Am29000 PROBE INTERFACE 

The Am29000 probe interface provides a non-intrusive, 
low-capacitance connection to an Am29000. Inserted 
between the processor and its socket, the probe inter- 
face makes the Am29000 pins available for convenient 



attachment to a logic analyzer or other test equipment. 
Figure 15 shows the probe interface. 

The software available with the probe interface supplies 
configuration information about the Am29000 pins and 
instruction mnemonics to either an HP 1650 or 16500 
logic analyzer for display formatting. When the display is
formatted, the logic analyzer will disassemble instruc- 
tions into mnemonics and display processor, bus, and 
error status, as well as data bus activity. Figure 16 
shows how the probe interface is connected between 
the logic analyzer and the target. 

Although the probe interface was designed for the HP 
1650 or 16500 logic analyzer, any type of test equip- 
ment can be attached to it. The following discussion 
assumes a connection to an HP 1650 or 16500 logic 
analyzer, unless otherwise stated.
Figure 15. The Probe Interface

Figure 16. Connection of the Probe Interface
(Diagram: the probe interface connects the logic analyzer to the
Am29000-based system.)



FEATURES 

The probe interface can add important event-trigger- 
ing and high-speed (10 ns) resolution capabilities, 
including: 

• Convenient connection to the target. 

• Low-capacitance probing. 

• Complete status information, including identification
of burst, pipeline, and simple accesses. 

• Status reporting of bus conditions, such as slave 
accesses, wait states, and co-processor transfers. 

• User-configurable setup and hold parameters allow 
triggering on a specific target condition. 

• Monitoring of all Am29000 signals except INCLK.

The probe interface comes with the disassembler, 
configuration files, and a user's manual. 



DISPLAYS 

Figure 17 shows data bus information, as would be 
shown on an HP 16500 logic analyzer. Figures 18 and 
19 show signal state and timing screens and the disas- 
sembly screen for the 16500 analyzer. 

TARGET DESIGN REQUIREMENTS 

Because the probe interface only monitors Am29000 
signals, there are no particular target compatibility 
requirements except for sufficient clearance to install 
the probe interface. Most applications will not be 
affected by low-capacitance, high-impedance connec- 
tion; however, see the probe interface data sheet for 
electrical and physical specifications. 

Apart from supporting the physical size and electrical 
specifications of the connection, a logic analyzer is 
needed. The logic analyzer should have 80 to 160 state
channels. Some termination adapters also are needed, 
depending on the number of state channels on the logic 
analyzer. 
Figure 17. HP 16500 Data Bus Information Display

Figure 18. HP 16500 Signal and Timing Display

Figure 19. HP 16500 Disassembly Listing
(Screen image: a state listing with columns for address in hex,
Am29000 disassembly mnemonics, and status in hex. Each bus cycle is
annotated with its condition, such as burst, continued burst, wait
state, or interrupt return.)



SUMMARY OF THE TOOLS 

From the sections on ADAPT29K, MON29K, XRAY29K,
and the probe interface, it should be clear that a com- 
prehensive range of tools exists for developing 
Am29000-based systems. Each of the available tools 
has unique characteristics that make it more advanta- 
geous in particular situations. Depending on the charac- 
teristics of the application, one or all of the tools may be 
needed. This section summarizes the information 
presented in the previous sections with emphasis on 
highlighting what conditions are most appropriate for a 
particular tool or tool combination, and what compatibil- 
ity requirements are placed on the target as a result of 
the tool selection. 

SELECTION GUIDE 

In the development phase of virtually any Am29000- 
based system, either the ADAPT29K or MON29K will be
needed. It is possible to debug a microprocessor system 
with only a logic analyzer and a PROM programmer, but 
this method is not very practical when compared against 
the following ADAPT29K and MON29K features: 

• Memory display and modification, including special 
registers. 

• Uploading and downloading of programs. 

• Execution control, including setting breakpoints and 
single-stepping. 

Apart from the advantages gained from MON29K and 
the ADAPT29K, their performance can be augmented in 



certain situations if they are combined with XRAY29K 
and/or the probe interface with a logic analyzer. The 
following questions highlight the critical target charac- 
teristics that suggest the optimum tool selection. 

How much memory does the target have? 

Perhaps the most crucial factor in deciding between the
ADAPT29K and MON29K is the size of the available target
memory, because it determines whether or not MON29K
can be used. Because MON29K is target resident, the
target must have at least 256 Kbytes of space in instruction
ROM, and 64 Kbytes of instruction/data RAM for
MON29K's workspace. An application without this
memory space will not be able to use MON29K, and will
have to use the ADAPT29K.
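
The memory rule above reduces to a simple comparison. The Python sketch below encodes the two figures quoted in the text (256 Kbytes of instruction ROM space and 64 Kbytes of instruction/data RAM); the function names are hypothetical, and the check deliberately ignores the other MON29K requirements, such as the bi-directional bus path.

```python
KBYTE = 1024
MON29K_ROM_BYTES = 256 * KBYTE  # instruction ROM space, from the text
MON29K_RAM_BYTES = 64 * KBYTE   # instruction/data RAM workspace, from the text


def can_host_mon29k(rom_space_bytes, ram_space_bytes):
    """True if the target meets MON29K's stated memory minimums."""
    return (rom_space_bytes >= MON29K_ROM_BYTES
            and ram_space_bytes >= MON29K_RAM_BYTES)


def debug_tool_options(rom_space_bytes, ram_space_bytes):
    """List the tools that remain candidates on memory grounds alone."""
    if can_host_mon29k(rom_space_bytes, ram_space_bytes):
        return ["MON29K", "ADAPT29K"]
    return ["ADAPT29K"]


print(debug_tool_options(512 * KBYTE, 128 * KBYTE))
print(debug_tool_options(128 * KBYTE, 128 * KBYTE))
```

A target that fails either minimum is left with the ADAPT29K only, which matches the conclusion of the paragraph above.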

For systems with sufficient memory, MON29K, 
ADAPT29K, or both may be used. While both have 
excellent debugging features, the ADAPT29K has some 
features MON29K does not, including: 

• Can halt a failing program 

• Provides a bus trace facility 

• Can force execution of an Am29000 instruction 

• Provides memory diagnostics 

• Can be used with a target that cannot run its 
program 
It should be noted that in most cases (see the Differ-
ences Between MON29K and ADAPT29K section), 
MON29K can halt a crashed program if an interrupt- 
driven serial I/O is provided on the target, and the target 
still is responding to interrupts. 

How many units will be produced? 

The number of units to be produced determines the 
volume over which the development and servicing costs 
can be defrayed. The ADAPT29K, while more powerful 
than MON29K, costs more and will raise the amount 
of nonrecurring charges that must be recovered. Of 
course, the difference will be insignificant for the advan- 
tages gained in large volumes. In fact, it may be advis- 
able to use the ADAPT29K when the product is in 
development and final test, using MON29K for field
service. 

How and where will servicing be performed?

Servicing can be performed on-site or at service cen- 
ters. Often this depends on the size, function, and value 
of the application system. If the system is moved to a 
service center for repair, the ADAPT29K will provide the
most capabilities, particularly when coupled with the 
probe interface and XRAY29K. 

However, the ADAPT29K may be too bulky to perform
maintenance on-site. MON29K can be embedded in the 
application and used to diagnose faults via a portable 
ASCII terminal or PC (which could run XRAY29K). 

How complex is the program? 

If the program is complex, XRAY29K should be consid- 
ered. Debugging complex programs using hex values 
and physical addresses can be very time consuming 
and error prone, especially programs containing many 
modules. Often, XRAY29K's windowed interface and 
source-level debugging language will greatly reduce 
time spent tracking down errors encountered in address 
calculations, decimal to hex conversions, or just looking 
up values in a listing. 

SUMMARY OF COMPATIBILITY REQUIREMENTS 

Once a combination of tools has been selected, it is 
important to ensure that they will be compatible with the 
target system. The following lists summarize the com- 
patibility requirements for each tool. More detailed 
explanations can be found in the specific sections 
related to the particular tool. 



ADAPT29K 

1. The target must support RAM in instruction ROM.

2. A bi-directional path must exist between the instruc- 
tion and data buses. 

3. There must be a one-to-one translation of 
addresses between buses. 

4. Target memory must be disabled either by a low
signal on the alignment pin (D4), or when OPT2-OPT0
are 06 hex.

5. There must be physical clearance for the connec- 
tion of the interface cable at the proper orientation. 

6. The signals driven by the ADAPT29K must be open- 
collector or three-state. 

MON29K 

1. The target must support RAM in instruction ROM.

2. A bi-directional path must exist between the instruc- 
tion and data buses. 

3. There must be a one-to-one translation of 
addresses between buses. 

4. The system memory must include 256 Kbytes in
instruction ROM beginning at Address to store the
MON29K program, and 64 Kbytes of instruction/data
RAM at Address for MON29K's workspace.

5. If program control must be recovered from the 
application without it ending or returning control 
normally, accommodations must be made to use 
interrupt-driven serial I/O. 

6. The I/O drivers may have to be modified. 

XRAY29K 

1. Requires a host system, such as an engineering 
workstation. 

2. Requires MON29K or ADAPT29K. 

Probe Interface 

1. Requires a logic analyzer (an HP 1650 or 16500 is
recommended). 

2. Requires termination adapters. 

3. There must be sufficient physical clearance to allow 
the probe to be attached to the target. 
A COMPATIBILITY EXAMPLE:
STANDALONE EXECUTION BOARD 

The Standalone Execution Board (STEB) is an excellent
example of compatibility with all the development tools.
It is a complete Am29000-based system that can run
many types of programs, including the software packages
MON29K and VRTX32/29000®.

The STEB can also be used with the ADAPT29K and/or 
the HP probe interface. STEB also can be used as an 



execution vehicle for application software or a compari- 
son system for isolating hardware faults. 

This section focuses on how the STEB's design 
achieves compatibility with the development tools. The 
major areas of the STEB are discussed, with emphasis 
on how each area contributes to compatibility. See 
Figure 20 for a block diagram of the STEB. 
Figure 20. Block Diagram of the STEB

FUNCTIONAL DESCRIPTION

Mounted on a single card, the STEB contains an
Am29000 with memory, I/O, and system timing
resources. See Appendix A for schematic diagrams,
Sheets 1 through 12. In addition to the Am29000 (U51
on Sheet 2), the STEB supports the Am29027 arithmetic
accelerator (U1 on Sheet 3). The Am29027 is capable
of high-speed, single-precision and double-precision 
arithmetic using fixed and floating-point numbers. It can 
be operated in pipelined or non-pipelined (flow-through) 
mode, depending on system capability and require- 
ments. The pipelined mode maximizes the overall 
execution time for scalar operations. 

System timing can be provided by one of two methods. 
The Am29000 itself can generate the system clock, 
which is output on the SYSCLK pin; or circuitry on the 
board (U8, U9 on Sheet 4) can generate an external
clock signal that can be applied to the SYSCLK pin of the 
processor. Clock selection is done by jumpers. 

Memory is supported in both the instruction ROM and 
instruction/data RAM spaces. Using a DIP switch (SW3
on Sheet 7), from 0 to 7 wait states may be selected.
Each space has its own wait-state generator, and may 
be configured separately, depending on the access 
speed of the installed memory devices. 



A 9513A timing controller is installed at U55-58, and 
U64 on Sheet 10. The 9513A supports up to five 16-bit
counters. Address decoding for various timer functions 
is provided by a PAL (U56 on Sheet 10). The clock 
source can be from the Am29000, a hardware oscillator, 
or a crystal oscillator. 

Power to the STEB is provided by a series-regulated 
power supply that provides a regulated +12 VDC and 
+5 VDC to the board. Connectors are furnished for
attachment to the type of power supply used with PCs.

CIRCUIT AREAS CONTRIBUTING TO 
COMPATIBILITY 

In the following section, circuit sections related to 
compatibility issues are described. The circuit sections 
are referenced by their locations on the STEB, as 
indicated in Figure 21.

ADAPT29K and MON29K Compatibility 

Because the ADAPT29K and MON29K are very similar 
to each other, several STEB design aspects simultane- 
ously address their compatibility requirements. These 
include the type of memory supported, and the bus 
architecture for accessing memory. 
Figure 21. Data Read from Instruction/Data RAM

Support for RAM Devices in the Instruction ROM
Space

The STEB supports RAM in the instruction ROM (U25,
U32 on Sheet 5) space and the instruction/data RAM
(U33-U43 on Sheets 6 and 7) space. The instruction
ROM space has a maximum capacity of 1024 Kbytes
and uses 27010 EPROMs. The instruction/data RAM
space has a maximum capacity of 512 Kbytes and uses
32-Kbyte x 8 static RAMs.

Instructions may be executed from either space. So that 
programs can be downloaded via the ADAPT29K or 
MON29K, the instruction ROM area can be constructed 
from 32-Kbyte X 8 static RAMs. However, the maximum 
memory size using RAM is limited to 256 Kbytes. 
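
The capacities quoted above can be related to device counts with a little arithmetic. The Python sketch below rests on two assumptions that are not stated in this section: that every memory device is byte-wide, and that a 27010 EPROM holds 128 Kbytes (the usual organization of a 1-Mbit part). The function name is illustrative.

```python
EPROM_27010_KBYTES = 128  # assumption: 27010 organized as 128K x 8
SRAM_KBYTES = 32          # 32-Kbyte x 8 static RAM, from the text


def devices_needed(space_kbytes, device_kbytes):
    """Number of byte-wide devices required to fill a memory space."""
    count, remainder = divmod(space_kbytes, device_kbytes)
    if remainder:
        raise ValueError("space is not a whole number of devices")
    return count


rom_sockets = devices_needed(1024, EPROM_27010_KBYTES)  # EPROM-filled ROM space
ram_devices = devices_needed(512, SRAM_KBYTES)          # instruction/data RAM
ram_in_rom_kbytes = rom_sockets * SRAM_KBYTES           # RAM fitted in ROM sockets

print(rom_sockets, ram_devices, ram_in_rom_kbytes)
```

Under these assumptions, the 256-Kbyte limit for RAM in the instruction ROM space is what the same set of sockets would hold when populated with 32-Kbyte devices instead of EPROMs.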

Swap Buffer 

On the STEB, a swap buffer provides the necessary
bi-directional path between the data bus and the instruction
bus (U11-U14 on Sheet 2). The swap buffer is
created from four 74ALS245 octal bus transceivers.
Transfer direction and timing are controlled by the
transceiver's ENA and A->B inputs. By decoding the
DREQT1-DREQT0, IREQT, OPT2-OPT0, DREQ, and
IREQ signals (U17, U18, U49 on Sheet 4) and applying
the result to the transceiver, the STEB channels data
between the buses at the appropriate time.

The swap buffer is not required in many straightforward
operations. For example, when assembling/disassembling
instructions or reading/writing other data into the
instruction/data RAM space, data is written directly to
the instruction/data RAM space over the data bus. Likewise,
a standard instruction fetch from the instruction
ROM space does not require the swap buffer, as instructions
may be loaded directly into the processor's instruction
pre-fetch buffer from the instruction bus.

However, when disassembling instructions in the
instruction ROM space, the instructions must be read as
data, which makes the swap buffers necessary. The
configuration of the IREQT bits causes an instruction
to be accessed from the instruction ROM, gated onto
the data bus, and read into the processor. Note the
combination of control signals indicated on the side of
the figure. They are used to select the path for data
movement.

Similarly, when instructions are fetched from the
instruction/data RAM, they must be transferred to the
instruction bus from the data bus. The direction of data
movement is shown by the darkened path in Figure 22.
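
The four cases described above reduce to one rule: the swap buffer is needed whenever the bus that issues the access is not the bus on which the addressed memory space sits. The following Python fragment is only a toy model of that rule; the string labels and the function name are hypothetical, and it does not reproduce the actual decode logic in U17, U18, and U49.

```python
# Toy model of the STEB bus-routing rule (not the board's PAL equations).
INSTRUCTION_BUS = "instruction bus"
DATA_BUS = "data bus"

# Bus on which each memory space sits, as described in the text.
HOME_BUS = {
    "instruction ROM": INSTRUCTION_BUS,
    "instruction/data RAM": DATA_BUS,
}


def needs_swap_buffer(access: str, space: str) -> bool:
    """Return True when the access has to cross between the two buses.

    access: "instruction fetch" or "data access" (hypothetical labels)
    space:  "instruction ROM" or "instruction/data RAM"
    """
    if access == "instruction fetch":
        requesting_bus = INSTRUCTION_BUS
    elif access == "data access":
        requesting_bus = DATA_BUS
    else:
        raise ValueError(f"unknown access type: {access!r}")
    return HOME_BUS[space] != requesting_bus


if __name__ == "__main__":
    for access in ("instruction fetch", "data access"):
        for space in HOME_BUS:
            print(f"{access:17} {space:21} swap buffer: {needs_swap_buffer(access, space)}")
```

Only the two "crossing" cases (a data read of the instruction ROM, and an instruction fetch from the instruction/data RAM) report that the swap buffer is required.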
Figure 22. Instruction Fetch from Instruction/Data RAM

One-to-One Address Translation

Note that addresses in both memory spaces have a
one-to-one translation. This means that when a data object
is stored at a given address in the instruction/data
RAM space, the exact same data object will be retrieved
when the same address is asserted by an instruction
fetch to the instruction/data RAM space. This is an
important requirement for assuring compatibility with
the ADAPT29K and MON29K because when they are
downloading programs, they store instructions as data
over the data bus. Neither tool has the capability to
translate a virtual address, so when the program is
executed it must find its instructions at their absolute
addresses.

ADAPT29K Compatibility 

In addition to the elements discussed in the ADAPT29K
and MON29K Compatibility section, certain considerations
were added to the STEB's design strictly for the
ADAPT29K. These include tri-stating the control lines
driven by the ADAPT29K and disabling memory during
data transfers to and from the ADAPT29K.

Tri-Stated Control Lines

The STEB must relinquish some control lines to the
ADAPT29K when it is operating. Therefore, these lines
are tri-stated or open-collector, as was described in
Table 7, thus preventing contention that might cause
unpredictable results.

When the ADAPT29K is not connected to the target, the
CNTL0 and CNTL1 lines are pulled high to ensure that
the processor is in a normal mode of operation. When
the ADAPT29K is connected to the target, it isolates the
CNTL1-CNTL0 signals from the board. Any use of those
signals by the application will be inhibited.

Memory Disable 

The STEB supports both methods of disabling memory 
for ADAPT29K accesses. Via a jumper selection, the 
STEB can be configured to either decode an 06 hex on 
the OPT bits or disable memory when the alignment pin 
is low. 

When jumper JP7 (on Sheet 7) has pins 1 and 2
connected together, it causes the SEL_OP signal to PAL
U20 (on Sheet 7) to be high. The ROM/RAM decode
circuit (composed of U15, U20, U21, and U24 on Sheets
6 and 7) then decodes the OPT2-OPT0 pins to determine
whether or not memory should be enabled.

Memory is disabled by a low state on the alignment pin
(D4) when jumper JP7 is used to connect pins 2 and 3
together. The low condition is decoded by the ROM/RAM
decode circuit, which then disables memory.
When the ADAPT29K is not installed, the alignment pin
is pulled high to prevent inadvertent and/or intermittent
memory disables.
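
The two jumper settings can be summarized as a small truth function. The sketch below is a behavioral model only, written under the assumption (taken from the text above) that an OPT2-OPT0 value of 06 hex marks an ADAPT29K access in the first mode; the function name and the jumper labels "1-2" and "2-3" are hypothetical.

```python
def memory_enabled(jp7: str, opt: int, alignment_pin_high: bool) -> bool:
    """Behavioral model of the STEB memory-disable selection.

    jp7: "1-2" selects decoding of the OPT bits,
         "2-3" selects the alignment pin (D4).
    opt: value on OPT2-OPT0 (only the low three bits are used).
    alignment_pin_high: True when the alignment pin is high.
    """
    if jp7 == "1-2":
        # Assumption: OPT = 06 hex identifies an access for which
        # target memory must stay off the bus.
        return (opt & 0x7) != 0x6
    if jp7 == "2-3":
        # A low alignment pin disables memory; the pull-up keeps it
        # high when no ADAPT29K is installed.
        return alignment_pin_high
    raise ValueError(f"unknown JP7 setting: {jp7!r}")


if __name__ == "__main__":
    print(memory_enabled("1-2", 0x6, True))    # OPT decode mode, ADAPT29K access
    print(memory_enabled("2-3", 0x0, False))   # alignment-pin mode, pin low
```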

MON29K Compatibility 

Apart from the requirements mentioned in the 
"ADAPT29K and MON29K Compatibility" section, 
MON29K needs at least one, and preferably two, serial 
port(s) to communicate with the host/operator. It also 
needs sufficient memory to contain the software. 

Serial Ports 

The serial ports are provided by the 8530 serial communications
controller (SCC) and support circuits located
at U1, U2, and U5-U7 (on Sheet 8). The SCC is a dual-channel,
multi-protocol data communications peripheral
designed for use with 8-bit and 16-bit microprocessors.
The interrupt request line INT can be wired to provide
a trap or interrupt to the processor for MON29K.
DIP switches on the board are used to select port
characteristics.

Because the 8530 is a dual-port device, it supports both
the DTE and DCE RS232 ports on the STEB. The ports
are standard RS232 ASCII ports. The DCE port can be used
to communicate with an ASCII terminal or PC running a
terminal emulator; the DTE port can communicate with a
remote host such as a UNIX machine.

Because the C language does not differentiate between 
address spaces, the serial ports must be memory- 
mapped into the Am29000 data space. This require- 
ment allows C code to be used in place of assembly 
language. 

Sufficient Memory Space 

Sufficient memory is provided on the STEB for 
MON29K. There is also room for additional application 
programs in the ROM space. The space normally is con- 
figured with MON29K in EPROMs (Bank 0), and RAM in 
the remaining banks. MON29K then could be used to 
download an application into the RAM in the instruction 
ROM space. 

MON29K also uses 64 Kbytes of workspace in RAM. 
This is provided for, with additional space available for 
use by the application program. 

Built-in Probe Interface 

The STEB includes built-in probe interface connectors. 
Thus, test equipment like the HP1650 or 16500 logic 
analyzer can be connected directly to the STEB, elimi- 
nating the requirement for a separate probe interface. 
Appendix A: STEB Schematic Diagrams

INTRODUCTION

Source code for a given application must be converted
to executable Am29000™ object code and transferred to
the appropriate storage media before it can be executed
in a real system. Usually several utilities are involved;
these include:

• Assemblers 

• Compilers 

• Linkers 

• Format translators (optional, depending on the desti- 
nation media) 

This application note shows how an example program in
source code form is made into object code and downloaded
to a target board with the ADAPT29K™
Advanced Development and Prototyping Tool, or programmed
into PROMs.

THE 29K TOOL CHAIN 

The 29K™ tool chain is used to produce the executable
object module. The tool chain is an integrated set of programs
that includes compilers, assemblers, linkers, and
format translators. These programs perform the operations
necessary to translate the source code into a
machine-readable format. The components of the 29K
tool chain are:

• HighC29K™ Compiler 

• ASM29K™ Assembler

• ASM29K Linker

• COFF2HEX (COFF to hexadecimal translator)

• ROMCOFF

• BTOA (binary to ASCII translator) 

Figure 1 shows the relationship of the 29K tool chain 
elements to each other. In the following discussion, 
familiarity with these tools is assumed. Consult the 
appropriate reference manuals for more details. 

The 29K tool chain can be run under UNIX®, SunOS®, or
DOS, but it must be installed properly on the host system
before the following example can be performed.
The host in the following discussion is assumed to be an
IBM® AT® or compatible.
SUGGESTED REFERENCE MATERIALS

Consult the following reference materials for more information
on the topics covered in this application note.

• Am29000 Streamlined Instruction Processor User's 
Manual, order #10620. It contains details regarding 
the instruction set and register organization of the 
Am29000. 

• Am29000 Streamlined Instruction Processor Data 
Sheet, order #09075. It embodies a great deal of 
information about the Am29000, including: distinctive 
characteristics, general description, simplified system 
diagram, connection diagram, pin designations and 
descriptions, functional description, absolute maxi- 
mum ratings, operational ranges, DC characteristics, 
switching characteristics and wave-forms, and physi- 
cal dimensions. 

• ADAPT29K User's Manual. It provides detailed infor- 
mation on the ADAPT29K, including installation, 
commands, theory of operation, and target design 
requirements. 

• ASM29K Documentation Set. It provides complete
information on the installation and use of the ASM29K
assembler, linker, and librarian manager. This
includes information on using the ROMCOFF and
COFF2HEX utilities.

• HighC29K Documentation Set. It covers how the 
Am29000 C compiler is used. 

These materials can be obtained by writing to: 

Advanced Micro Devices, Inc. 
901 Thompson Place 
P.O. Box 3453 
Sunnyvale, CA 94088-3453 

or by calling (800) 222-9323. 

For questions that cannot be resolved with the current 
literature, further technical support can be obtained by 
writing or calling: 

29K Support Products Engineering 

Mail Stop 561 

5900 E. Ben White Blvd. 

Austin, TX 78741 

(800) 2929-AMD (US) 

0-800-89-1131 (UK) 

0-031-11-1129 (Japan) 

THE EXAMPLE SYSTEM 

The example system used for illustration in this docu- 
ment consists of a generic hardware environment and a 
small software program. The only function of this self- 
contained standalone system is to test a block of mem- 
ory. This section describes how the example system 
works. 



SOFTWARE 

The software is a small program that initializes its operating
environment and then continuously tests memory.
It comprises a boot module and a C-language module.
A flow chart for the complete application is shown in
Figure 2.

The main portions of the program are contained in two 
source files: smplboot.s and cprog.c. The smplboot.s 
module is an assembly-language boot program that 
receives control on power up. The C-language program 
cprog.c performs the memory test. 

The tasks performed by smplboot.s are: (1) estab- 
lish the execution environment, (2) set up a block of 
initialized data in instruction/data RAM (using a rou- 
tine generated by the ROMCOFF utility), (3) call the 
main program cprog.c, and (4) evaluate the results of 
the memory test. If the test fails, smplboot.s halts the 
processor. 

The cprog.c program tests a 32K byte block of RAM, 
using a simple binary write and read test. Then, cprog.c 
checks the validity of the initialized data section in 
instruction/data RAM. After each successful comple- 
tion, a flag is returned to smplboot.s, which increments 
a counter. If a test fails, cprog.c returns the address of 
the failing memory location. A memory map of the appli- 
cation is shown in Figure 3. 

Three additional files (traps.s, r29k.s, and scregs.def) 
contain the supporting procedures and declarations. All 
of the files in the application are listed in Appendices A 
through E. To actually perform the example, the files 
must be entered onto the host system. 

HARDWARE ENVIRONMENT 

The application runs on the Standalone Execution 
Board (STEB), manufactured by STEP Engineering. 
Figure 4 shows a block diagram of the STEB, which 
contains an Am29000, some RAM and ROM, and two 
serial ports (provided by an 8530 serial communications 
controller). 

A few important features of the STEB should be noted. 
First, data can be passed between the instruction and 
data buses via a bi-directional swap buffer. The swap 
buffer permits code to be downloaded into the instruc- 
tion RAM area via the ADAPT29K. It also allows data 
objects in the instruction ROM space to be read as data. 

Second, the instruction ROM space can contain RAM 
devices or ROM devices. RAM devices should be 
installed when working with the ADAPT29K (see 
Appendix F), so that code can be downloaded into the 
instruction ROM space. 






29K Family Application Notes 



[Flow chart boxes recoverable from Figure 2: smplboot.s; Initialize Am29000; Transcribe Initialized Data]



Figure 2. Flow Chart of the Example Application 
Figure 3. Memory Map of the Example Application

Figure 4. Block Diagram of the STEB



PREPARING AN EXECUTABLE 
OBJECT MODULE 

Preparing the executable object module involves several
steps. Typically, the steps are repeated frequently
because errors must be corrected and revisions must be
made. The process can be automated by placing the
commands in a DOS batch file. Listing 1 shows the
batch file sc.bat, which is used in the example application.
Following the listing, each step is explained.






Preparing PROMs Using the Am29000 Development Tools 



Listing 1. The Batch File sc.bat 



@echo off
echo *********************************************************
echo "Compiling cprog.c and Assembling the .s files"
echo *********************************************************
hc29 -c -w cprog.c > cprog.e
hc29 -S -Hasm cprog.c > cprog.e
as29 -l > smplboot.lst -o smplboot.o smplboot.s
as29 -l > traps.lst -o traps.o traps.s
as29 -l > r29K.lst -o r29k.o r29k.s
echo *********************************************************
echo "Linking object files with libraries and generating"
echo "executable object module for ROMCOFF"
echo *********************************************************
ld29 -c step1.cmd -o step1.out -f tx -m > outlink.map
echo *********************************************************
echo "Using ROMCOFF"
echo *********************************************************
c:\29k\bin\romcoff -tlb step1.out rom.o
echo *********************************************************
echo "Linking object files with libraries and generating"
echo "final executable object module"
echo *********************************************************
as29 -l > smplboot.lst -DRAMINIT -o smplboot.o smplboot.s
ld29 -c step2.cmd -o step2.out -f tx -m > step2.map
echo *********************************************************
echo "Converting executable object code to downloadable format"
echo *********************************************************
c:\29k\bin\btoa step2s.out sea
echo *********************************************************
echo "Converting executable into PROM-programmable format"
echo *********************************************************
coff2hex -c t -m -p 27512 step2e.out > step2.e
echo on



COMPILING CPROG.C AND ASSEMBLING
THE .S FILES

The first group of operations in the batch file obtains 
relocatable object modules from the source files. The 
C-language source file cprog.c is compiled by invoking 
the HighC29K compiler with the command line: 

hc29 -c -w cprog.c 

HighC29K replaces the symbolic instructions in the
source file with equivalent machine-code routines. Then
a relocatable object file (cprog.o) is produced, as
shown in Figure 5.



The parameter -w suppresses warning messages, limiting
the output to errors only; the -c parameter
instructs the compiler to produce the object file.
Note that a second compilation is performed with the
-Hasm flag on. This produces an assembly listing (.s
file) only.

Next, the ASM29K assembler is used to assemble the
modules smplboot.s, traps.s, and r29k.s. This involves
replacing assembly-language symbolic instructions
in the source file with the corresponding machine
instruction code. To assemble smplboot.s and obtain a
relocatable object file, the following command line can
be entered:

as29 -l > smplboot.lst -o smplboot.o smplboot.s

A relocatable object file (smplboot.o) and a listing file 
(smplboot.lst) are produced from the assembly. All 
assembly-time errors are directed to the std.out. The 
operation is shown in Figure 6. The same operation is 
done on traps.s and r29k.s. 

Linking 

Once the relocatable object files have been made, they
must be linked (i.e., assigned physical addresses). This
is done using the ASM29K linker, which allows one or
more object files from either the assembler or the
compiler to be linked together into a single executable
object file.

The object modules are linked by entering the command
line:

ld29 -c step1.cmd -o step1.out -f tx -m > outlink.map

Using the command file step1.cmd (see Listing 2), the
files smplboot.o, r29k.o, and traps.o are linked with
cprog.o into a single, non-relocatable object file called
step1.out. A reference to where each module was placed is
put in the map file outlink.map. Any error messages are
sent to the std.out. The linking process produces a map
file that lists the local symbol table, external symbols,
and the cross-reference. This type of output is a good
reference to the entire application program.
Figure 5. Compiling cprog.c

Figure 6. Assembling smplboot.s



Listing 2. The Linker Command File step1.cmd 



ORDER .text=0x0
ORDER .bss=0x100400
ORDER .data=0x100420
PUBLIC _MSTACK=0x1f7fc
PUBLIC _RSTACK=0x1fffc
load smplboot.o, r29k.o, traps.o
load cprog.o
load c:\29k\lib\libmw.lib







TRANSFERRING CODE FROM ROM TO RAM: 
ROMCOFF 

The smplboot.s file contains a section of initialized data 
that must be loaded into instruction/data RAM and 
tested by the application program. This could be accom- 
plished by writing many lines of const, consth, and 
storem instructions into the smplboot.s file. Another 
method is to use the ROMCOFF utility. 

The ROMCOFF utility transforms user-specified sections
of an Am29000 program into a stream of instructions
that will perform the transcription. From a fully
linked, executable Am29000 program, the ROMCOFF
utility generates a COFF output file containing
initializers that will establish the image of an executable
COFF input file in instruction/data RAM. The output file
contains one section, RI_text, within which is one routine,
RAMInit. The output file can then be linked with
other relocatable modules that will remain in instruction
ROM, to produce a single non-relocatable module for
programming PROMs.
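
Conceptually, the generated routine replays a list of address/value pairs into RAM. The sketch below models only that effect in Python; it does not reproduce ROMCOFF's output (which is Am29000 code in COFF form). The function names, the dictionary used as RAM, and the big-endian word packing are assumptions made for illustration.

```python
import struct


def make_initializers(base_address: int, image: bytes):
    """Turn a section image into (address, 32-bit word) pairs.

    Assumption: words are packed big-endian and the image is
    zero-padded to a multiple of four bytes.
    """
    padded = image + b"\x00" * (-len(image) % 4)
    words = struct.unpack(f">{len(padded) // 4}I", padded)
    return [(base_address + 4 * i, word) for i, word in enumerate(words)]


def ram_init(ram: dict, initializers) -> None:
    """Model of what a RAMInit-style routine accomplishes:
    store each word at its absolute address."""
    for address, word in initializers:
        ram[address] = word


if __name__ == "__main__":
    ram = {}
    # 0x100420 is the .data address used in the linker command files.
    ram_init(ram, make_initializers(0x100420, bytes.fromhex("deadbeef0001")))
    for address in sorted(ram):
        print(f"{address:08x} {ram[address]:08x}")
```

Because the initializers carry absolute addresses, the data lands where the linker placed the section, which is why the routine can be linked into ROM and simply called once at boot.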
ROMCOFF can be used to transcribe entire sections of
code into instruction/data RAM. Then, once the application's
boot program has finished preparing the environment,
it transfers control to the transcribed program in
instruction/data RAM. This allows the code to be
executed out of high-speed RAM devices, which are
frequently more cost effective than high-speed PROMs.
See Figure 7.

In the example program, only a section of initialized data
in smplboot.s is transferred to RAM. ROMCOFF
creates a relocatable object module that transcribes the
data sections to RAM when the following command line
is entered:

romcoff -tlb step1.out rom.o

The linked output file step1.out is made into the file
rom.o. Only the data section is output, because of the
ROMCOFF options -tlb, which specify that the text,
literal, and bss sections should be ignored.

The output from ROMCOFF (rom.o) contains only code
to transcribe data sections. It must be re-linked with the
object files to produce a final absolute object module.
First, the code in smplboot.s, which contains a call to
the RI_text section, must be assembled to include the
conditional assembly statements.

To assemble smplboot.s so that it will contain the call,
enter:

as29 -l > smplboot.lst -DRAMINIT -o smplboot.o smplboot.s

The -D option defines RAMINIT so that conditional assembly
statements in the source file will be assembled.
The statements include an external declaration of RAMInit, and a
call to it. Then, all of the object modules can be linked
with rom.o as follows:

ld29 -c step2.cmd -o step2.out -f tx -m > step2.map

A second linker command file is used because rom.o
must be identified to the linker (see Listing 3).

Figure 7. Using ROMCOFF

Listing 3. The Linker Command File step2.cmd



ORDER .text=0x0,RI_text
ORDER .bss=0x100400
ORDER .data=0x100420
PUBLIC _MSTACK=0x1f7fc
PUBLIC _RSTACK=0x1fffc
load smplboot.o
load rom.o
load r29k.o, traps.o
load cprog.o
load c:\29k\lib\libmw.lib



DOWNLOADING TO THE ADAPT29K 

Once the final executable object module is created, the 
example program can be downloaded to the target 
system and tested using the ADAPT29K. 



USING BTOA 

The BTOA utility creates an ASCII COFF output from
the input file. Although the ADAPT29K can handle
Tektronix® or Motorola® hex files, using the BTOA utility
to make the ASCII hex file has several advantages.

Most importantly, BTOA encodes the input file into
(7-bit) ASCII using a compact base-85 scheme that limits
file expansion to only 25 percent, as opposed to 150 percent
for standard hex formats. Hence, the resulting output
file is smaller, and consequently quicker to transfer.
Also, BTOA maintains the ASCII COFF format, rather
than converting it to absolute addresses.
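
The 25 percent figure follows from encoding every four input bytes as five printable characters (a scheme commonly known as base-85). The short Python check below illustrates the arithmetic using the standard library's Ascii85 encoder; this encoder is used here only as a stand-in for the idea and is not claimed to produce the BTOA file format. Plain hexadecimal text is included for comparison; hex record formats add addresses and checksums on top of the two-characters-per-byte cost shown here.

```python
import base64
import binascii

# 200 sample bytes, chosen so that no 4-byte group is all zeros
# (Ascii85 abbreviates such groups, which would distort the ratio).
payload = bytes(range(1, 201))

ascii85 = base64.a85encode(payload)   # 4 bytes -> 5 characters
hextext = binascii.hexlify(payload)   # 1 byte  -> 2 characters


def expansion(encoded: bytes, original: bytes) -> float:
    """Growth of the encoded form relative to the original size."""
    return (len(encoded) - len(original)) / len(original)


if __name__ == "__main__":
    print(f"4-to-5 encoding: {expansion(ascii85, payload):.0%}")
    print(f"bare hex digits: {expansion(hextext, payload):.0%}")
```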

As shown in the sc.bat batch file, BTOA produces the 
output file sea and is invoked by: 

btoa step2s.out sea 



00000000R  c6400200  MFSR   GR64,CPS
00000004R  03fb41ff  CONST  GR65,0xFBFF
00000008R  90404041  AND    GR64,GR64,GR65
0000000cR  ce000240  MTSR   CPS,GR64
00000010R  03004000  CONST  GR64,0x0
00000014R  ce000040  MTSR   VAB,GR64
00000018R  0300403f  CONST  GR64,0x3F
0000001cR  ce000740  MTSR   RBP,GR64

Figure 8. List Memory Display

Listing 4. Results of "End Execution" Command List



> d 400,420 

00000400 00000000 00000000 00000000 00000000 
00000410 00000000 00000000 00000000 00000000 
00000420 00000000 



TESTING THE EXAMPLE PROGRAM WITH THE 
ADAPT29K 

Once the object module has been translated using the
BTOA utility, it can be downloaded to the target using
ADAPT29K. For use with ADAPT29K, the STEB should
be configured as indicated in Appendix F.

To download the file, communication must be estab- 
lished with the ADAPT29K. On a PC, this is done by 
invoking the terminal emulator program (for example, 
CrossTalk®), establishing communication with the 
ADAPT29K, and entering (note that # is the ADAPT29K 
monitor prompt): 

# ya c , r 

The Y (load a file to memory) command prepares the
ADAPT29K to receive an ASCII-encoded file from the
DCE port. Then, the emulator must be instructed to
transmit the file (for example, se sea when using
CrossTalk). After the code has been downloaded, and
the next prompt has appeared, the contents of the
instruction ROM can be verified by entering:

# l 0r

The ADAPT29K should respond to the L (list memory)
command with the display shown in Figure 8. The locations
starting at 0x400 in instruction/data RAM contain
the status of the test and number of successful loops,
respectively. Which location actually contains which
variable is a decision made by the linker, and must be
determined by inspection.
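
As a cross-check of the Figure 8 display, the hexadecimal words can be decoded by hand or with a few lines of code. The Python sketch below handles only the four instructions that appear in the figure. The opcode values, field positions, and special-register numbers are assumptions based on the general Am29000 instruction format as the display itself suggests; they should be confirmed against the Am29000 User's Manual before being relied on.

```python
# Special-purpose register numbers (assumed; only the first eight).
SPECIAL = {0: "VAB", 1: "OPS", 2: "CPS", 3: "CFG",
           4: "CHA", 5: "CHD", 6: "CHC", 7: "RBP"}


def reg(number: int) -> str:
    """Name a general-purpose register: 0-127 global, 128-255 local."""
    return f"GR{number}" if number < 128 else f"LR{number - 128}"


def decode(word: int) -> str:
    """Decode one 32-bit instruction word (four opcodes only)."""
    opcode = word >> 24
    b2 = (word >> 16) & 0xFF
    b1 = (word >> 8) & 0xFF
    b0 = word & 0xFF
    if opcode == 0x03:   # CONST: 16-bit constant split around RA
        return f"CONST {reg(b1)},0x{(b2 << 8) | b0:X}"
    if opcode == 0xC6:   # MFSR: destination register, special register
        return f"MFSR {reg(b2)},{SPECIAL[b1]}"
    if opcode == 0xCE:   # MTSR: special register, source register
        return f"MTSR {SPECIAL[b1]},{reg(b0)}"
    if opcode == 0x90:   # AND: RC, RA, RB
        return f"AND {reg(b2)},{reg(b1)},{reg(b0)}"
    return f".word 0x{word:08x}"  # anything else is left undecoded


if __name__ == "__main__":
    for word in (0xC6400200, 0x03FB41FF, 0x90404041, 0xCE000240,
                 0x03004000, 0xCE000040, 0x0300403F, 0xCE000740):
        print(f"{word:08x}  {decode(word)}")
```

Running the sketch on the eight words from Figure 8 reproduces the mnemonics shown in the right-hand column of the display.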

To check these locations automatically when the execu- 
tion stops, set up an "end execution" command list by 
entering: 

# e d 400,420; 

The list is executed on entry. It should appear as shown 
in Listing 4. 



GR080  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR088  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR096  00104a18 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR104  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR112  00000000 00000000 00000000 00000000 80000020 000095d9 00100400 00000095
GR120  ffffffff 80000000 00000000 00000000 00000000 0001f7fc 00000fff 06050101
LR000  00000928 0001fffc 00100414 00108414 000000f0 0001fffc 00000000 00000000
LR008  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR016  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR024  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR032  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR040  00000000 00000000 00000000

Figure 9. Key Registers Display
Prior to starting the test, it is a good practice to reset the
system by using the P reset command: 

# p reset 

To verify the condition of the system before execution, 
the X (Display Key Registers) command is entered as: 

# X 

This will result in a display as shown in Figure 9. The 
special-purpose protected registers can be checked 
using the XP (display protected registers) command. 
The display appears as shown in Figure 10. 

To execute the program starting from address 0 in
instruction ROM, the G (go — start execution) command 
is used: 

# g 0r

During execution, the status of the program can be 
checked by invoking the previously defined "end execu- 
tion" command list. 



Enter: 

# e 

The display will be similar to that shown in Figure 11.
The precise display in any given situation, particularly
the loop count stored in location 40C, is dependent
on the exact time elapsed between the start of execution
and the entry of the E command. At another time, it may
appear as shown in Figure 12.
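
When the display is captured to a file by the terminal emulator, the loop count can be extracted mechanically. The following Python sketch parses the text of a memory display such as Figure 11 into an address-to-word table; the function name and the captured-text constant are hypothetical, and the sketch assumes each data line starts with an address followed by consecutive 32-bit words.

```python
def parse_dump(text: str) -> dict:
    """Map each word address in a memory display to its value."""
    memory = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the echoed command, and the prompt.
        if len(fields) < 2 or fields[0] in (">", "#"):
            continue
        base = int(fields[0], 16)
        for index, word in enumerate(fields[1:]):
            memory[base + 4 * index] = int(word, 16)
    return memory


# Text of the Figure 11 display as it would appear in a capture file.
FIGURE_11 = """\
> d 400,420
00000400 009595d9 00000000 00108414 0000002d
00000410 00100414 00000000 00000000 00000000
00000420 00000000
"""

if __name__ == "__main__":
    memory = parse_dump(FIGURE_11)
    print("loop count at 0x40C:", memory[0x40C])
```

For the Figure 11 values the word at 0x40C is 0x2d, i.e., 45 completed loops; the corresponding word in Figure 12 is 0xe1, i.e., 225.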

The state of the processor can be checked using the C 
(check execution state command): 

# c 

When the processor is running, ADAPT29K displays: 

Am29000 is Running.



# xp
[Protected registers display. The CPS, OPS, VAB, CFG, and CHC bit-field rows are not legibly recoverable; the legible rows follow.]
CHA: 00104a14   CHD: 00000000
RBP: BF BE BD BC BB BA B9 B8 B7 B6 B5 B4 B3 B2 B1 B0
     0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1
PC0: 00000a34   PC1: 00000a30   PC2: 00000a2c
#

Figure 10. Protected Registers Display 
> d 400,420
00000400 009595d9 00000000 00108414 0000002d
00000410 00100414 00000000 00000000 00000000
00000420 00000000

Figure 11. Check Status Display



> d 400,420
00000400 009595d9 00000000 00108414 000000e1
00000410 00100414 ffffffff ffffffff ffffffff
00000420 ffffffff
#



Figure 12. Second Check Status Display 
PREPARING PROMs

Once the absolute object file has been prepared, it must
be transferred to the media from which the code will be
executed. Often, this medium is a PROM set. Most
PROM programmers require their input to be in an
ASCII hex format, so a translation normally is performed
before sending the program to the PROM programmer.

MAKING HEX FILES: COFF2HEX

The COFF2HEX utility produces a 32-bit ASCII hex file
in either the Motorola S3 or Tektronix Extended format.
Both of these formats are accepted by most PROM
programmers, as well as the ADAPT29K. Note that the
ADAPT29K requires the file to be one module, rather
than being divided into separate modules by part size
(see the options of the COFF2HEX utility).



In sc.bat, COFF2HEX is invoked by entering

coff2hex -c t -m -p 27512 step2e.out > sccoff.e



This produces 8-bit wide modules that will fit into a 
27512 EPROM (-p option). The format is Motorola S3 
(-m option), and will include only the text sections (-c t 
option). 

The resulting file(s) will be named a.a00, a.a08, a.a16,
and a.a24, indicating which bytes of the word they
represent. If the file is larger than the capacity of the part
size specified, additional sets of four will be generated
with filenames a.b00, a.b08, a.b16, a.b24, and so on,
with further sets having a corresponding nomenclature.
Once generated, the files can then be transmitted to a
PROM programmer.
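
The naming pattern can be generated programmatically, which is convenient when scripting the transfer to the programmer. The Python sketch below is an illustration of the nomenclature described above, not of COFF2HEX itself. It assumes a 27512 part holds 64K x 8, so one set of four byte-wide modules covers 256 Kbytes of 32-bit code; the set-count formula and the function name are the editor's inference rather than documented behavior.

```python
import math
import string

PROM_BYTES = 64 * 1024               # capacity of one 27512 (64K x 8)
LANES = ("00", "08", "16", "24")     # starting bit of each byte lane


def module_names(image_bytes: int, stem: str = "a") -> list:
    """List the module files expected for an image of the given size."""
    bytes_per_set = PROM_BYTES * len(LANES)
    sets = max(1, math.ceil(image_bytes / bytes_per_set))
    return [f"{stem}.{string.ascii_lowercase[s]}{lane}"
            for s in range(sets)
            for lane in LANES]


if __name__ == "__main__":
    print(module_names(10_000))        # fits in one set of four PROMs
    print(module_names(300 * 1024))    # needs a second set (a.b00 ...)
```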



PROGRAMMING THE PROMS 

A PROM programmer is used to "burn" the binary object 
file into PROM devices. Many types of PROM program- 
mers are available. The Data I/O Unisite® PROM 
programmer is used in the following example. 

Assuming an object module had been created as
described in the first part of this document (and a set
of Motorola S3 modules were obtained using
COFF2HEX), the following procedure could be used to
create a PROM set.

1. Turn on the PROM programmer. Make sure the 
algorithm disk is properly inserted in the lower front 
slot. 

2. Once the power-up sequence and diagnostics have 
completed, a screen should appear on the attached 
terminal. If there is no terminal, or the screen does 
not appear, refer to the set-up section of the user's 
manual for the PROM programmer. 

3. Make sure a host system is attached. In this example,
the use of a PC is assumed. At the PC, set the
COM1 serial port of the PC to 9600 baud, no parity,
8-bit bytes, and one stop-bit by entering: mode
com1:96,n,8,1. On the PROM programmer, select
"Configure System," followed by "Edit," and then
"Serial I/O." Make sure the remote port parameters
are set properly.

4. The program will be placed in AMD 27512 PROMs.
To inform the PROM programmer, choose "Select
Device," "3" (AMD), and "25" (27512).

5. It is a good idea to clear the PROM programmer's
memory before downloading data. This ensures
that the PROMs do not become programmed with
leftover data from a previous operation, which may
cause troublesome errors. To clear the memory,
select "Fill Memory." Enter 00 to 7FFFF as the
address range, and FF as the data.

6. The PROM programmer must know the format of 
the incoming data. Select "Transfer Data," followed 
by "Format Select." Enter "95" for Motorola S3 
Record. 

7. Select "Load Device" on the programmer. On the
PC, enter:

copy a.a00 com1:

This causes the lowest 8 bits of the application to be
transmitted to the PROM programmer, which will
load the data into its memory.

8. Properly insert a PROM into the ZIF socket on the
PROM programmer and engage the locking mechanism.
Select the "Program Device" option on the PROM
programmer.

9. Once the PROM has been burned, remove it and
label it with the program name, range of bits,
version, and date. Then, repeat steps 7-9 using the
files a.a08 through a.a24. If a larger program is
used, it may be necessary to repeat steps 7-9 using
modules a.b00, a.b08, a.b16, a.b24, and so on.
APPENDIX A: smplboot.s





        .extern r29k_init               ; assembly module
        .extern _main                   ; C module
        .extern V_SPILL,V_FILL          ; Linker definable V_SPILL and V_FILL vector numbers
        .extern spill,fill              ; spill and fill procedure
        .extern _RSTACK,_MSTACK         ; Link time definable stack pointer assignments
        .equ    ROM_TH,0x2              ; Spill and fill trap interface do truly reside in ROM space
        .equ    RSC_SIZE,0x200          ; Default reg_stack_cache usage=512
        .equ    TBM_SIZE,0x20000        ; 32K*4=128kb of Inst/RAM size
        .include "scregs.def"
        .data
        .word   [201170
        .comm   mtp_count,4
        .text
        .ifdef  RAMINIT                 ; if RAMINIT Flag on
        .extern RAMInit                 ; make RAMInit available
        .endif
        .global start
start:
        mfsr    tmp0,CPS                ; Read CPS
        const   tmp1,0xFBFF             ; Clear FZ bit
        and     tmp0,tmp0,tmp1
        mtsr    CPS,tmp0                ; Update CPS
        const   tmp0,0
        mtsr    VAB,tmp0                ; Set VAB pointing to LOW memory
        const   tmp0,0x11               ; Set VF=1, i.e. Vector table scheme and CD=1,
        mtsr    CFG,tmp0                ; i.e. Branch Target Cache is disabled
        const   tmp2,0                  ; Write Data pattern = 0x00000000
        const   tmp0,0                  ; Low memory address
        consth  tmp1,TBM_SIZE           ; High memory address
        sub     tmp1,tmp1,tmp0          ; Get address difference
        srl     tmp1,tmp1,2             ; Get word count from diff value
        sub     tmp1,tmp1,2             ; adjustments for jmpfdec instr
mem_00:
        store   0,0,tmp2,tmp0           ; fill TB_memory with all zeros
        jmpfdec tmp1,mem_00
        add     tmp0,tmp0,4
        const   tmp0,256-2              ; Total of 256 vector table entries
        const   tmp1,illtrap+0x2        ; ROM based illegal trap handlers
        consth  tmp1,illtrap            ; address, by default
        const   tmp2,0
vtd_init:
        store   0,0,tmp1,tmp2           ; fill vector table with default
        jmpfdec tmp0,vtd_init           ; trap handlers
        add     tmp2,tmp2,4
        const   tmp0,spilltrap+ROM_TH   ; get spill trap entry point
        consth  tmp0,spilltrap
        const   tmp1,V_SPILL            ; get spill trap vector number
        sll     tmp1,tmp1,2             ; generate vect number location
        store   0,0,tmp0,tmp1           ; store address of trap handler into vector table
        const   tmp0,filltrap+ROM_TH    ; get fill trap entry point
        consth  tmp0,filltrap
        const   tmp1,V_FILL             ; get fill trap vector number
        sll     tmp1,tmp1,2             ; generate vect number location
        store   0,0,tmp0,tmp1           ; store address of trap handler into vector table
        const   rfb,_RSTACK             ; Set RFB
        consth  rfb,_RSTACK
        const   tmp0,RSC_SIZE           ; 0x200=512 bytes ie 128*4
        sub     rab,rfb,tmp0            ; Set RAB=RFB-512
        sub     rsp,rfb,0x8             ; Set RSP=RFB-8
        const   msp,_MSTACK             ; Set MSP
        consth  msp,_MSTACK
        add     lr1,rfb,0               ; Set lr1 to RFB
        const   tmp0,r29k_init
        consth  tmp0,r29k_init
        calli   lr0,tmp0                ; call procedure to init 29K registers
        nop
        .ifdef  RAMINIT                 ; if RAMINIT on.
        const   tmp0,RAMInit            ; set up RAMInit call
        consth  tmp0,RAMInit
        calli   gr96,tmp0               ; and do the call
        .else
        nop                             ; make sure code takes same
        nop                             ; number of locations
        nop                             ; regardless of RAMINIT condition
        .endif
        nop                             ; in case we did calli
        const   tmp0,exec               ; get target application task address
        consth  tmp0,exec
        mtsrim  OPS,0x172               ; RE=1, PI=1, PD=1, SM=1 and DI=1
        mtsrim  CPS,0x573
        mtsr    PC1,tmp0                ; Set Target application Task address
        add     tmp0,tmp0,4
        mtsr    PC0,tmp0
        xor     tmp0,tmp0,tmp0          ; Any additional regs clean up
        iret                            ; Give control to application via IRET
exec:
        const   lr0,_main               ; get C-callable routine entry point
        consth  lr0,_main
        calli   lr0,lr0                 ; make the call
        nop
        sll     gr97,gr64,0             ; Save user global registers gr64
        sll     gr98,gr65,0             ; through gr66
        sll     gr99,gr66,0
        const   gr64,mtp_count          ; get address of memory test pass
        consth  gr64,mtp_count          ; count recorder
        load    0,0,gr65,gr64           ; get current count so far
        cpeq    gr67,gr96,0             ; check for memory test pass?
        jmpt    gr67,again              ; true then run test again
        nop
        halt                            ; false halt further memory testing
again:
        add     gr65,gr65,1             ; bump mtp_count by 1
        store   0,0,gr65,gr64           ; update in memory also
        sll     gr64,gr97,0             ; Restore user global registers gr64
        sll     gr65,gr98,0             ; through gr66
        sll     gr66,gr99,0
        jmp     exec                    ; run the memory test once again
        nop
spilltrap:
        mfsr    tpc,PC1                 ; save return address in tpc
        const   tmp0,spill              ; get spill procedure entry point
        consth  tmp0,spill
        mtsr    PC1,tmp0                ; fill Am29000 pipeline target address
        add     tav,tmp0,tmp0+4
        mtsr    PC0,tmp0                ; fill Am29000 pipeline with target address+4
        iret
filltrap:
        mfsr    tpc,PC1                 ; save return address in tpc
        const   tmp0,fill               ; get fill procedure entry point
        consth  tmp0,fill
        mtsr    PC1,tmp0                ; fill Am29000 pipeline target address
        add     tav,tmp0,tmp0+4
        mtsr    PC0,tmp0                ; fill Am29000 pipeline with target address+4
        iret
illtrap:
        halt
        .end
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APPENDIX B: cprog.c 

#define MT_PASSED       0
#define SOLID_ONES      -1
#define SOLID_ZEROS     0
#define MT_BLK_SIZE     32768
#define WORD_SIZE       4
#define INIT_DATA       170
#define MEM_BLOCK       1056
#define INIT_DATA_BASE  1280
#define INIT_DATA_SIZE  15

int *mt_sts;
int lm_addr,hm_addr;
int initdata;
int *mem_test();

main()
{
    lm_addr = INIT_DATA_BASE;
    hm_addr = INIT_DATA_BASE+MT_BLK_SIZE/WORD_SIZE;
    initdata = MEM_BLOCK;
    mt_sts = mem_test(lm_addr,hm_addr,initdata);
}

/* Returns the first failing address, or MT_PASSED if every test passes */
int *mem_test(low, high, initd)
int *low, *high, *initd;
{
    int *addr;

    /* Solid Ones test */
    for (addr=low; addr<=high; addr++)
        *addr = SOLID_ONES;
    for (addr=low; addr<=high; addr++)
        if(*addr != SOLID_ONES)
            return (addr);

    /* Solid Zeros test */
    for (addr=low; addr<=high; addr++)
        *addr = SOLID_ZEROS;
    for (addr=low; addr<=high; addr++)
        if(*addr != SOLID_ZEROS)
            return (addr);

    /* Initialized data test */
    for (addr=initd; addr<initd+INIT_DATA_SIZE; addr++)
        if(*addr != INIT_DATA)
            return (addr);
    return (MT_PASSED);
}
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APPENDIX C: r29k.s 





        .macro  I29kGPR,gpr_nu
        xor     gpr_nu,gpr_nu,gpr_nu
        .endm

        .macro  I29kSPR,spr_nu
        mtsrim  spr_nu,0
        .endm

        .macro  I29kMPR,tlbr_nu
        const   gr65,tlbr_nu
        mttlb   gr65,gr66
        .endm

        .text
        .global r29k_init
r29k_init:
        I29kGPR gr67                    ; Set GR67-GR127 to known state
        I29kGPR gr68
        I29kGPR gr69
        I29kGPR gr70
        I29kGPR gr71
        I29kGPR gr72
        I29kGPR gr73
        I29kGPR gr74
        I29kGPR gr75
        I29kGPR gr76
        I29kGPR gr77
        I29kGPR gr78
        I29kGPR gr79
        I29kGPR gr80
        I29kGPR gr81
        I29kGPR gr82
        I29kGPR gr83
        I29kGPR gr84
        I29kGPR gr85
        I29kGPR gr86
        I29kGPR gr87
        I29kGPR gr88
        I29kGPR gr89
        I29kGPR gr90
        I29kGPR gr91
        I29kGPR gr92
        I29kGPR gr93
        I29kGPR gr94
        I29kGPR gr95
        I29kGPR gr96
        I29kGPR gr97
        I29kGPR gr98
        I29kGPR gr99
        I29kGPR gr100
        I29kGPR gr101
        I29kGPR gr102
        I29kGPR gr103
        I29kGPR gr104
        I29kGPR gr105
        I29kGPR gr106
        I29kGPR gr107
        I29kGPR gr108
        I29kGPR gr109
        I29kGPR gr110
        I29kGPR gr111
        I29kGPR gr112
        I29kGPR gr113
        I29kGPR gr114
        I29kGPR gr115
        I29kGPR gr116
        I29kGPR gr117
        I29kGPR gr118
        I29kGPR gr119
        I29kGPR gr120
        I29kGPR gr121
        I29kGPR gr122


        I29kGPR gr123
        I29kGPR gr124
        I29kGPR lr2                     ; Set lr2-lr127 to known state
        I29kGPR lr3
        I29kGPR lr4
        I29kGPR lr5
        I29kGPR lr6
        I29kGPR lr7
        I29kGPR lr8
        I29kGPR lr9
        I29kGPR lr10
        I29kGPR lr11
        I29kGPR lr12
        I29kGPR lr13
        I29kGPR lr14
        I29kGPR lr15
        I29kGPR lr16
        I29kGPR lr17
        I29kGPR lr18
        I29kGPR lr19
        I29kGPR lr20
        I29kGPR lr21
        I29kGPR lr22
        I29kGPR lr23
        I29kGPR lr24
        I29kGPR lr25
        I29kGPR lr26
        I29kGPR lr27
        I29kGPR lr28
        I29kGPR lr29
        I29kGPR lr30
        I29kGPR lr31
        I29kGPR lr32
        I29kGPR lr33
        I29kGPR lr34
        I29kGPR lr35
        I29kGPR lr36
        I29kGPR lr37
        I29kGPR lr38
        I29kGPR lr39
        I29kGPR lr40
        I29kGPR lr41
        I29kGPR lr42
        I29kGPR lr43
        I29kGPR lr44
        I29kGPR lr45
        I29kGPR lr46
        I29kGPR lr47
        I29kGPR lr48
        I29kGPR lr49
        I29kGPR lr50
        I29kGPR lr51
        I29kGPR lr52
        I29kGPR lr53
        I29kGPR lr54
        I29kGPR lr55
        I29kGPR lr56
        I29kGPR lr57
        I29kGPR lr58
        I29kGPR lr59
        I29kGPR lr60
        I29kGPR lr61
        I29kGPR lr62
        I29kGPR lr63
        I29kGPR lr64
        I29kGPR lr65
        I29kGPR lr66
        I29kGPR lr67
        I29kGPR lr68
        I29kGPR lr69


        I29kGPR lr70
        I29kGPR lr71
        I29kGPR lr72
        I29kGPR lr73
        I29kGPR lr74
        I29kGPR lr75
        I29kGPR lr76
        I29kGPR lr77
        I29kGPR lr78
        I29kGPR lr79
        I29kGPR lr80
        I29kGPR lr81
        I29kGPR lr82
        I29kGPR lr83
        I29kGPR lr84
        I29kGPR lr85
        I29kGPR lr86
        I29kGPR lr87
        I29kGPR lr88
        I29kGPR lr89
        I29kGPR lr90
        I29kGPR lr91
        I29kGPR lr92
        I29kGPR lr93
        I29kGPR lr94
        I29kGPR lr95
        I29kGPR lr96
        I29kGPR lr97
        I29kGPR lr98
        I29kGPR lr99
        I29kGPR lr100
        I29kGPR lr101
        I29kGPR lr102
        I29kGPR lr103
        I29kGPR lr104
        I29kGPR lr105
        I29kGPR lr106
        I29kGPR lr107
        I29kGPR lr108
        I29kGPR lr109
        I29kGPR lr110
        I29kGPR lr111
        I29kGPR lr112
        I29kGPR lr113
        I29kGPR lr114
        I29kGPR lr115
        I29kGPR lr116
        I29kGPR lr117
        I29kGPR lr118
        I29kGPR lr119
        I29kGPR lr120
        I29kGPR lr121
        I29kGPR lr122
        I29kGPR lr123
        I29kGPR lr124
        I29kGPR lr125
        I29kGPR lr126
        I29kGPR lr127
        I29kSPR OPS                     ; Set sp1, sp4-sp9 to known state = 0
        I29kSPR CHA
        I29kSPR CHD
        I29kSPR CHC
        I29kSPR RBP
        I29kSPR TMC
        I29kSPR TMR
        I29kSPR MMU                     ; Set sp13 and sp14 to known state = 0
        I29kSPR LRU
        I29kSPR IPC                     ; Set sp128-135 to known state = 0
        I29kSPR IPA


        I29kSPR IPB
        I29kSPR Q
        I29kSPR ALU
        I29kSPR BP
        I29kSPR FC
        I29kSPR CR
        const   gr66,0
        I29kMPR 0                       ; Set tr0-tr127 to known state = 0
        I29kMPR 1
        I29kMPR 2
        I29kMPR 3
        I29kMPR 4
        I29kMPR 5
        I29kMPR 6
        I29kMPR 7
        I29kMPR 8
        I29kMPR 9
        I29kMPR 10
        I29kMPR 11
        I29kMPR 12
        I29kMPR 13
        I29kMPR 14
        I29kMPR 15
        I29kMPR 16
        I29kMPR 17
        I29kMPR 18
        I29kMPR 19
        I29kMPR 20
        I29kMPR 21
        I29kMPR 22
        I29kMPR 23
        I29kMPR 24
        I29kMPR 25
        I29kMPR 26
        I29kMPR 27
        I29kMPR 28
        I29kMPR 29
        I29kMPR 30
        I29kMPR 31
        I29kMPR 32
        I29kMPR 33
        I29kMPR 34
        I29kMPR 35
        I29kMPR 36
        I29kMPR 37
        I29kMPR 38
        I29kMPR 39
        I29kMPR 40
        I29kMPR 41
        I29kMPR 42
        I29kMPR 43
        I29kMPR 44
        I29kMPR 45
        I29kMPR 46
        I29kMPR 47
        I29kMPR 48
        I29kMPR 49
        I29kMPR 50
        I29kMPR 51
        I29kMPR 52
        I29kMPR 53
        I29kMPR 54
        I29kMPR 55
        I29kMPR 56
        I29kMPR 57
        I29kMPR 58
        I29kMPR 59
        I29kMPR 60
        I29kMPR 61
        I29kMPR 62

        I29kMPR 63
        I29kMPR 64
        I29kMPR 65
        I29kMPR 66
        I29kMPR 67
        I29kMPR 68
        I29kMPR 69
        I29kMPR 70
        I29kMPR 71
        I29kMPR 72
        I29kMPR 73
        I29kMPR 74
        I29kMPR 75
        I29kMPR 76
        I29kMPR 77
        I29kMPR 78
        I29kMPR 79
        I29kMPR 80
        I29kMPR 81
        I29kMPR 82
        I29kMPR 83
        I29kMPR 84
        I29kMPR 85
        I29kMPR 86
        I29kMPR 87
        I29kMPR 88
        I29kMPR 89
        I29kMPR 90
        I29kMPR 91
        I29kMPR 92
        I29kMPR 93
        I29kMPR 94
        I29kMPR 95
        I29kMPR 96
        I29kMPR 97
        I29kMPR 98
        I29kMPR 99
        I29kMPR 100
        I29kMPR 101
        I29kMPR 102
        I29kMPR 103
        I29kMPR 104
        I29kMPR 105
        I29kMPR 106
        I29kMPR 107
        I29kMPR 108
        I29kMPR 109
        I29kMPR 110
        I29kMPR 111
        I29kMPR 112
        I29kMPR 113
        I29kMPR 114
        I29kMPR 115
        I29kMPR 116
        I29kMPR 117
        I29kMPR 118
        I29kMPR 119
        I29kMPR 120
        I29kMPR 121
        I29kMPR 122
        I29kMPR 123
        I29kMPR 124
        I29kMPR 125
        I29kMPR 126
        I29kMPR 127
        jmpi    lr0                     ; return to caller
        const   gr65,0
        .end
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APPENDIX D: traps.s 



; TRAPS.S
; Spill and fill process

        .text
        .global spill,fill
        .include "scregs.def"

spill:
        sub     tav,rab,rsp             ; compute spill: lower bound - sp
        sub     rfb,rfb,tav             ; adjust rfb pointer
        srl     tav,tav,2               ; shift to get number of words
        sub     tav,tav,1               ; count is one less
        mtsr    CR,tav                  ; set Count Remaining register
        storem  0,0,lr0,rfb             ; spill
        sll     rfb,rab,0               ; adjust rfb pointer
        jmpi    tpc                     ; return to "caller"
        sll     rab,rsp,0               ; adjust rab

fill:
        const   tav,0x80<<2             ; local register bit
        or      tav,tav,rfb             ; in rfb for IPA
        mtsr    IPA,tav                 ; IPA gets starting register number
        sub     tav,lr1,rfb             ; compute number of bytes to fill
        add     rab,rab,tav             ; push up the allocate bound
        srl     tav,tav,2               ; change byte count to word count
        sub     tav,tav,1               ; make count zero-based
        mtsr    CR,tav                  ; set Count Remaining register
        sll     tav,rfb,0               ; save old rfb
        sll     rfb,lr1,0               ; push up the free bound
        jmpi    tpc                     ; return to "caller"
        loadm   0,0,gr0,tav             ; fill
        .end






Preparing PROMs Using the Am29000 Development Tools 



APPENDIX E: scregs.def 



        .reg    rsp,gr1                 ; register stack pointer
        .reg    msp,gr125               ; memory stack pointer
        .reg    rab,gr126               ; register allocate bound
        .reg    rfb,gr127               ; register free bound
        .reg    tpc,gr121               ; trap handler argument/temp
        .reg    tav,gr122               ; trap handler return address/temp
        .reg    tmp0,gr64               ; temp registers allocations
        .reg    tmp1,gr65
        .reg    tmp2,gr66
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APPENDIX F: CONFIGURATION OF THE 
STEB 



[Figure 13. Configuration of the STEB (drawing 11966A-13). Labels on the
board layout include: P12 and P13 daughter-board connectors; J1 (DCE) and
J2 (DTE) RS-232 ports; 8530 SCC; 9513A; switches SW1 through SW4; Reset;
WAIT; LEDs; Clock jumper JP8; Target Memory Disable jumper JP7; power
connectors P10 and P11; connectors P1, P2, and P5 through P9; LA
connectors; Interrupts & Traps; Am29000; Am29027; U52; U64; ROM Size and
RAM Size jumpers JP1 through JP6; ROM space sockets in the U25 through U32
range; instruction/data RAM space sockets in the U33 through U48 range.]

Footnotes:

*  ROM space: ROM Space Bank 0, ROM Space Bank 1
** Instruction/data RAM space: RAM Space Bank 0, RAM Space Bank 1,
   RAM Space Bank 2, RAM Space Bank 3

Note: The STEB uses PROMs (which can contain MON29K) in ROM space bank 0.
Alternatively, RAMs can be installed in the ROM space bank for downloading
programs using ADAPT29K.

Figure 13. Configuration of the STEB
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Application Note 






by Jim Gibbons and Doug Walton 



INTRODUCTION 

Advanced Micro Devices is developing a complete line
of Am29000™ simulators, hardware-target execution
vehicles, and high-level language development tools for
the Am29000 32-bit Streamlined Instruction Processor.
These products are designed to support end-users who
are building embedded system applications based on
the Am29000 processor. These users often have
no existing operating system or kernel for their hardware
design.

A standalone program runs independently of an operat- 
ing system or other supporting software. As opposed to 
a program that runs under an operating system, a stand- 
alone program is concerned about the characteristics of 
the hardware environment. It controls hardware devices 
and must be aware of the system architecture. Conse- 
quently, the needed executive functions that would be 
performed by an operating system must be designed 
into the application program. 

HOW TO USE THIS APPLICATION NOTE 

This document covers some important issues in pro- 
gramming a standalone Am29000 system. Its purpose 
is not to explain every possible implementation of the 
Am29000, but to present a basic framework from which 
to start development. 

Many sample sections of code are shown. Most are
taken from the STARTUP files provided on the
ASM29K software and are listed in the appendices.
These files can be consulted for a complete example of
the boot-up and initialization process. Be aware that the
range of possible applications in which the Am29000
can be used is extensive, and it would be impossible to
provide code that will work in every situation. The code
samples have been tested in simple, limited applications,
and should be used as a guideline, not as a
finished solution.

The Effects of Memory Organization section discusses 
how the memory organization of an Am29000 system 
affects the design of a standalone program. 
Attention is given to the location from which code is 
executed and how it is accessed. 



The Am29000 Calling Convention section summarizes 
the Am29000 run-time model. Writing good Am29000 
assembly-language programs requires knowledge of 
the run-time model. Code samples used in this applica- 
tion note follow the convention established by the run- 
time model. Understanding the convention eases 
understanding the examples. 

The Writing the Start-up Program section explains how 
an example startup program works. Each task done in 
the process is discussed, from configuring the Am29000 
through calling _main. 

SUGGESTED REFERENCE MATERIALS 

This application note covers fundamental design issues 
involved in implementing a standalone Am29000 sys- 
tem. However, designing a standalone system is a com- 
plex task involving many areas. Knowledge of the 
Am29000 is necessary, as well as the subjects covered 
in the following reference materials. 

Am29000 Streamlined Instruction Processor User's 
Manual, order #10620. It contains details regarding the 
instruction set and register organization of the 
Am29000. 

Am29000 Streamlined Instruction Processor Data 
Sheet, order #09075. It embodies a great deal of infor- 
mation about the Am29000, including distinctive char- 
acteristics, general description, simplified system 
diagram, connection diagram, pin designations and 
descriptions, functional description, absolute maxi- 
mum ratings, operational ranges, DC characteristics, 
switching characteristics and waveforms, and physical 
dimensions. 

Am29000 Memory Design Handbook, order #10623.
It discusses in detail the tradeoffs in designing an
Am29000 memory system, and covers four
different approaches to optimizing access speed versus
cost and memory size.

Implementation of an Am29000 Stack Cache Applica- 
tion Note. It describes in detail how a stack cache would 
be used in a simple application. 
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These materials can be obtained by writing to: 

Advanced Micro Devices, Inc. 
901 Thompson Place 
P.O. Box 3453 
Sunnyvale, CA 94088-3453 

or by calling (800) 222-9323. 

For questions that cannot be resolved with the current 
literature, further technical support can be obtained by 
writing or calling: 

29K Support Products Engineering 

Mail Stop 561 

5900 E. Ben White Blvd. 

Austin, TX 78741 

(800) 2929-AMD (US) 

0-800-89-1131 (UK) 

0-031-11-1129 (Japan) 



THE EFFECTS OF MEMORY 
ORGANIZATION 

The organization of memory determines some of the 
duties that software must perform. The physical charac- 
teristics of the memory design have an impact on the 
system responsibilities in a standalone environment. 
Where the various types of memory are located and how 
they are accessed must be considered. 

While many types of memory organization are possible
in an Am29000 system, this discussion covers only a
couple of the more widely known methods. The emphasis
is not on describing all of the possibilities, but on
showing how the duties of the standalone program
change depending on how the system memory is
arranged. For more information on the advantages and
disadvantages of various Am29000 memory schemes,
see the Am29000 Memory Design Handbook.

MEMORY SPACES 

The Am29000 uses a three-bus Harvard architecture,
which allows for many different types of memory organization.
As shown in Figure 1, the Am29000 buses
include the address bus, the data bus, and the instruction
bus. All are 32 bits wide, but only the data bus is
bidirectional. The address bus is output-only; the
instruction bus is input-only. Using the buses and some
control signals, the Am29000 supports five separate
memory spaces. The available spaces are register, I/O,
instruction ROM, coprocessor, and instruction/data
RAM.



In any given system, the application program will reside
in some memory area(s), and will execute
from some area(s). The areas can be the same, but they
also can be different. Sometimes, the application
program must be transcribed from one space into
another before execution.

When code is transferred from one memory area to 
another, it is usually done so that a higher rate of execu- 
tion can be achieved. Because the Am29000 is very 
fast, it can be limited by the access time of memory. Yet, 
high-speed PROMs are very expensive. Often it is more 
cost-effective to transcribe the code from slow PROMs 
to high-speed RAMs before execution. 
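
Transcription itself is just a word-by-word copy. The C sketch below illustrates the idea under the assumption that both areas are visible to the copying code as ordinary 32-bit memory; the function name transcribe is hypothetical and is not the routine generated by the development tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `nwords` 32-bit words from a slow ROM image to fast RAM.
 * The pointers are volatile so the compiler performs every access.
 * Hypothetical helper for illustration only. */
void transcribe(const volatile uint32_t *rom, volatile uint32_t *ram,
                size_t nwords)
{
    size_t i;

    assert(rom != NULL && ram != NULL);
    for (i = 0; i < nwords; i++)
        ram[i] = rom[i];
}
```

Whether the copied code can then be executed depends on the bus architecture, which is discussed next.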

BUS ARCHITECTURE 

Bus architecture influences how data and instructions
are transferred from one memory space to another. The
Am29000 system in Figure 1 has separate instruction
ROM and instruction/data RAM areas. Code could be
transcribed into the instruction RAM area from the
instruction ROM using a series of const and consth
instructions, but a problem would be evident: the
Am29000 fetches the instructions from the instruction
bus, regardless of the memory space in which the
instructions reside. With the system in Figure 1, code
transcribed to RAM cannot be executed because there
is no access to the instruction bus.

One method of resolving this problem is to establish a 
path between the data bus and the instruction bus. Such 
a path can be provided through a swap buffer, as shown 
in Figure 2. The swap buffer is bidirectional, which al- 
lows data or instructions on one bus to be moved to the 
other. 

A different solution is used on AMD's PC Execution
Board (PCEB29K™), where fixed storage for data and
programs is on the host PC. When code is to be run, it is
loaded into video DRAM (VDRAM) installed in the
instruction/data RAM space. The dual-ported VDRAM
has its shifter output connected to the Am29000 instruction
bus and its data bus connected to the Am29000
data bus. In this way, the same physical address space
exists on both buses, and data can be read or written via
either the instruction bus or the data bus (see Figure 3).
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Am29000 CALLING CONVENTIONS 

To enhance code readability and accuracy, the
Am29000 run-time model convention is used. This convention
defines standards for register declarations,
parameter passing, spill and fill routines, and other topics.

There are many good reasons for using the Am29000
run-time model. First, it allows assembly-language
programs to interface with C programs compiled by
the HighC29K™ compiler. Second, it makes programs
easier to understand, particularly for other developers
making modifications or complementary products.
Third, it has been tested thoroughly in many different
environments. Using it from the start will likely save time
later in the development process.

This section is a summary of the Am29000 run-time
model. Because the code samples in the "Writing the
Start-up Program" section follow the convention established
by the run-time model, understanding it will make
the code samples clearer. See also the Am29000
Streamlined Instruction Processor User's Manual.



DECLARATIONS 

A file containing the declarations outlined in the conven- 
tion normally is called into each module that uses the 
definitions. A declarations file can be called into an 
assembly-language source file by inserting a statement 
(usually at or near the top of the file) like: 

.include "romdcl.h" 

In this example, a declarations file named romdcl.h 
would be used with the program. For convenience, the 
declarations required to understand the code sections in 
this document are summarized in Table 1. 
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THE Am29000 RUN-TIME STORAGE 
ORGANIZATION 

In a high-level language that supports nested function
calls (such as C), specific information related to each
function invocation often is stored on a run-time stack.
The Am29000 run-time stack is actually two stacks. One
is the register stack; the other is the memory stack.

Both stacks start at an arbitrary high address in memory
and grow downward as function calls nest deeper. The
"bottom" of the stack is the high address where the stack
starts; the "top" of the stack is where the last stack item
was placed, or the address of the lowest valid location.

Table 1. Summary of Am29000 Register Names

Protected Special Purpose Register Names

Name    Number  Description
vab     0       Vector Area Base Address
ops     1       Old Processor Status
cps     2       Current Processor Status
cfg     3       Configuration Register
cha     4       Channel Address
chd     5       Channel Data
chc     6       Channel Control
rbp     7       Register Bank Protect
tmc     8       Timer Counter
tmr     9       Timer Reload
pc0     10      Program Counter 0
pc1     11      Program Counter 1
pc2     12      Program Counter 2
mmu     13      MMU Configuration
lru     14      LRU Recommendation

Unprotected Special Purpose Register Names

Name    Number  Description
ipc     128     Indirect Pointer C
ipa     129     Indirect Pointer A
ipb     130     Indirect Pointer B
q       131     Q
alu     132     ALU Status
bp      133     Byte Pointer
fc      134     Funnel Shift Count
cr      135     Load/Store Count Remaining



The register stack contains dynamically allocated information
pertaining to the local state of a given function
call, such as incoming arguments, local variables, and
outgoing arguments being passed to another function.
These function-specific data are organized into a series
of overlapping structures called activation records or
stack frames. A function is active when invoked, and
each active function has an activation record somewhere
on the register stack. When a function is entered,
a new activation record, or register stack frame, is
created; when the function is exited, its activation record
is removed. An activation record is shown in Figure 4.

An important characteristic of activation records is that,
because the outgoing arguments of a calling function
("caller") are the incoming arguments of the called function
("callee"), the callee's stack frame overlaps with the
caller's stack frame. Consequently, except for the first
activation record on the stack, the incoming arguments
of the callee are identical to the outgoing arguments
from the caller for each nested function. Figure 5 shows
how activation records overlap on the register stack.

Because the Am29000 has a large, pointer-addressable
internal local register file, it is possible to cache a portion
of the register stack in local registers (see Figure 6).
Where the next byte is placed is determined by rsp (the
register stack pointer). The global register GR1 is
assigned as the rsp because it can point to the current
stack position in external memory, while bits 2-9 identify
the current lr0. Activation records are allocated by
subtracting the size of the frame needed from rsp, thus
allocating a new block of local registers unique to this
function invocation.
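
The relationship between rsp and the local registers can be sketched in C. The sketch assumes the mapping described in the User's Manual: the local-register number is added to bits 8-2 of rsp, the sum wraps within the 128 local registers, and the result is offset by 128 to give the absolute register number. The function names are hypothetical, and the mapping should be checked against the manual before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute register number (128-255) of local register lr<n>,
 * given the current register stack pointer (GR1).
 * Assumed mapping: 128 + ((rsp bits 8-2) + n) mod 128. */
unsigned abs_local_reg(uint32_t rsp, unsigned n)
{
    unsigned base;

    assert(n < 128);                    /* lr0 through lr127 only */
    base = (unsigned)(rsp >> 2) & 0x7Fu;
    return 128u + ((base + n) & 0x7Fu);
}

/* Allocate a frame of `nregs` registers by lowering rsp;
 * each register occupies one 4-byte word of the stack. */
uint32_t alloc_frame(uint32_t rsp, unsigned nregs)
{
    return rsp - 4u * nregs;
}
```

With rsp = 496, for example, lr0 falls on absolute register 252 and lr4 wraps around to absolute register 128.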
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Figure 4. An Activation Record 



Caching the register stack introduces the operations
described below:

Spill. The portion of the register stack cached in local
registers cannot exceed 128 registers; if it does, the oldest
arguments are spilled to external memory. A spill occurs
when rsp becomes less than rab (the register allocate
bound).

Prologue. A prologue routine is an assembly-language
macro that, given the number of incoming arguments,
outgoing arguments, and local arguments, will allocate
a register stack frame for the function.

Epilogue. An epilogue routine is an assembly-language
macro that deallocates the register stack frame and
causes a jump to the return address.

Fill. When control is being returned to calling functions, a
previously spilled activation record may not exist in the
local register file. Then the register file needs to be filled
from the register stack in external memory. A fill occurs
when rsp is higher than rfb (the register free bound).
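
The spill and fill conditions reduce to pointer comparisons, and the amount to move is the distance past the bound. The C sketch below follows the arithmetic used by the handlers in Appendix D (rab - rsp for a spill, lr1 - rfb for a fill). The function names are hypothetical, and taking the fill amount from lr1 is an assumption based on that listing.

```c
#include <assert.h>
#include <stdint.h>

/* Words that must be spilled after a frame allocation lowered rsp.
 * Returns 0 when rsp has not dropped below rab (no spill needed). */
uint32_t spill_words(uint32_t rsp, uint32_t rab)
{
    assert((rsp & 3u) == 0 && (rab & 3u) == 0);   /* word-aligned */
    return (rsp < rab) ? (rab - rsp) / 4u : 0u;
}

/* Words that must be filled before returning to a caller whose
 * frame extends up to lr1. Returns 0 when lr1 does not exceed rfb. */
uint32_t fill_words(uint32_t lr1, uint32_t rfb)
{
    assert((lr1 & 3u) == 0 && (rfb & 3u) == 0);   /* word-aligned */
    return (lr1 > rfb) ? (lr1 - rfb) / 4u : 0u;
}
```

The handlers load one less than these counts into the Count Remaining register, because the storem and loadm counts are zero-based.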
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WRITING THE START-UP PROGRAM 

System initialization is one of the most critical duties
performed by software in the standalone system.
Devices must be configured, memory set up, and traps
and vectors defined. In short, an execution environment
must be prepared for the application program. If this is
not done properly, the main application program will not
function properly, and could contain difficult-to-find
errors. So careful attention must be given to the routines
that initialize the system.

This section discusses writing an assembly-language 
module that will establish the execution environment for 
a C application program. To demonstrate this, an exam- 
ple program is developed in a step-by-step fashion. 

The example application is designed to run on an
Am29000 system similar to the system shown in
Figure 7. The system provides a generic Am29000 environment
with instruction/data RAM (VDRAM), instruction
ROM, and a dual-port 8530 serial communications
controller (SCC). The dual-port VDRAM allows instructions
to be read from RAM.

The example program consists of three assembly-language
modules and a declarations file. The assembly-language
module START.S (listed in Appendix B) is
startup code that establishes the environment for a
C-language program. The assembly-language module
BOOT.S (listed in Appendix A) transfers the START.S
and the C-language application code to RAM, as shown
by the black arrows in Figure 7. BOOT.S then passes
control to START.S. The final assembly-language program
is TEST.S (listed in Appendix C). TEST.S simulates
a C-language application and tests whether the
startup has been properly performed. The declarations
file (ROMDCL.H) and the linker command file
(TEST.LD) are listed in Appendices D and E, respectively.

MAKING A BOOT.S MODULE TO 
TRANSCRIBE CODE 

BOOT.S receives control first. It establishes serial
communications, tests RAM, and transcribes the application
code into RAM. The sequence performed by
BOOT.S is:

1. Configure the Am29000.

2. Establish a register stack frame.

3. Initialize serial I/O for error reporting.

4. Test RAM.

5. Set pointers to invalid trap handler.

6. Call RAMInit (made by ROMCOFF) to transcribe
code.

7. Transfer control to START.S.

Step 1— Configuring the Am29000 

BOOT.S first configures the Am29000's current processor
status register (cps) to a known state by executing
the instruction:

mtsrim cps, 0x173 ;RE,PD,PI,SM,DI,DA

This instruction enables instruction fetching from ROM
(RE = 1), sets address translation for data and instructions
off (PD, PI = 1), turns on supervisor mode (SM = 1),
and disables all interrupts and traps (DI, DA = 1).

Figure 7. Example Am29000 System
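
The cps value written in Step 1 can be rebuilt from its individual bits as a cross-check. In the C sketch below, the bit positions are inferred from the constants used in this data book (0x173 here, and the 0xFBFF mask that smplboot.s uses to clear FZ); they are assumptions to be confirmed against the User's Manual, and the CPS_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions in the Current Processor Status register. */
enum {
    CPS_DA = 1 << 0,    /* disable all interrupts and traps   */
    CPS_DI = 1 << 1,    /* disable interrupts                 */
    CPS_SM = 1 << 4,    /* supervisor mode                    */
    CPS_PI = 1 << 5,    /* physical addressing, instructions  */
    CPS_PD = 1 << 6,    /* physical addressing, data          */
    CPS_RE = 1 << 8,    /* ROM enable                         */
    CPS_FZ = 1 << 10    /* freeze                             */
};

/* Value loaded by "mtsrim cps, 0x173" in Step 1. */
uint32_t boot_cps(void)
{
    uint32_t v = CPS_RE | CPS_PD | CPS_PI | CPS_SM | CPS_DI | CPS_DA;

    assert((v & CPS_FZ) == 0);      /* Step 1 leaves FZ clear */
    return v;
}
```

Naming the bits this way makes it easier to see what a later change to the constant would enable or disable.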

Step 2— Establishing a Simple Register Stack 
Frame 

BOOT.S calls several procedures, so it establishes a 
Register Stack Frame. However, control will not return 
to BOOT.S after calling _main. Therefore, it only needs 
to use a limited stack frame. The frame is set up with: 



        const   rfb, 512        ;set up temp reg frame
        const   rab, 0
        sub     rsp, rfb, 16    ;enough for p0 and p1
        add     lr1, rfb, 0



Step 3— Initializing I/O Devices 

An I/O device is initialized early, so that it can be used to 
transmit error messages. The 8530 serial communica- 
tions controller is initialized using the routine shown in 
Listing 1.

Listing 1. Initializing I/O Devices

SerInit:
        .reg    SI_CtAd, %%(TEMP_REG + 0)       ;control port address
        .reg    SI_CtVl, %%(TEMP_REG + 1)       ;control port value
        const   SI_CtAd, SCCCntlAd
        consth  SI_CtAd, SCCCntlAd
        const   SI_CtVl, 9                      ;reset the port
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 4                      ;x16, 1 stop, no parity
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x44
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 3                      ;8 bits receive
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 5                      ;8 bits xmit
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x60
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 9                      ;Int. disabled
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 10                     ;NRZ
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 11                     ;Tx & Rx BRG out
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x56
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 12                     ;9600 baud
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x6
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 13                     ;9600 baud
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 14                     ;BRG in RTxC
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 14                     ;BRG on
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x1
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 3                      ;Rx enable
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc1
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 5                      ;Tx enable
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xea
        store   0, 0, SI_CtVl, SI_CtAd
        EPILOGUE
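
The initialization routine follows one pattern throughout: write a register number to the control port, then write the value for that register. As a rough host-side illustration (Python, not target code), the same sequence can be held in a table and expanded into the stream of control-port writes. The register/value pairs are copied from the listing; the names SCC_INIT and control_port_writes are hypothetical.

```python
# Host-side sketch. Each pair is (8530 register number, value), in the
# order the listing writes them to the control port.
SCC_INIT = [
    (9, 0xC0),   # reset the port
    (4, 0x44),   # x16, 1 stop, no parity
    (3, 0xC0),   # 8 bits receive
    (5, 0x60),   # 8 bits xmit
    (9, 0x00),   # Int. disabled
    (10, 0x00),  # NRZ
    (11, 0x56),  # Tx & Rx BRG out
    (12, 0x06),  # 9600 baud
    (13, 0x00),  # 9600 baud
    (14, 0x00),  # BRG in RTxC
    (14, 0x01),  # BRG on
    (3, 0xC1),   # Rx enable
    (5, 0xEA),   # Tx enable
]


def control_port_writes(table):
    """Expand (register, value) pairs into the flat sequence of port writes."""
    writes = []
    for register, value in table:
        writes.append(register)  # first store selects the register
        writes.append(value)     # second store supplies its value
    return writes


print(len(control_port_writes(SCC_INIT)))
```

A table of this kind is easier to audit against the 8530 data sheet than a long run of const/store pairs, at the cost of a small loop in the startup code.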



Step 4— Testing RAM 

The RAM is tested before code is transferred to it.
BOOT.S calls a single test, an address pattern test.
Other tests are included in the source listing shown in
Appendix A. The test used by BOOT.S is shown in
Listing 2.



Step 5— Setting the Vector Table Entries to the 
Invalid Trap Handler 

START.S will set up the vector table, but BOOT.S
guards against abnormal ends by making all of the
vector table entries point to an invalid trap handler in
ROM. This is done with the routine shown in Listing 3,
which is called from the main loop.



Listing 2. Testing RAM

        .sbttl  "RAM Address Pattern Test"
        FUNCTION RAMAddr, 2, 0, 3

This routine will run a two-pass test on RAM. It will be controlled by input values
specifying the base address and the count of locations to be tested. In the first
pass, the data will be set equal to the address. In the second pass, the data
will be set equal to the complement of the address.

In:     (see below)
Out:    (see below)

        .reg    RA_StrtAdd, %%(IN_PRM + 0)      ;starting address
        .reg    RA_WrdCnt, %%(IN_PRM + 1)       ;count of words
        .reg    RA_TmpCnt, %%(TEMP_REG + 0)     ;total test word count
        .reg    RA_StrtPat, %%(TEMP_REG + 1)    ;starting pattern
        .reg    RA_PtrnInc, %%(TEMP_REG + 2)    ;ptrn increment value
        .reg    RA_NxtAdd, %%(OUT_PRM + 0)      ;error address
        .reg    RA_WrtPat, %%(OUT_PRM + 1)      ;pattern written
        .reg    RA_RedPat, %%(OUT_PRM + 2)      ;pattern read
        .reg    RA_Fail, %%(RET_VAL + 0)        ;TRUE for fail
        add     RA_StrtPat, RA_StrtAdd, 0       ;start with address
        const   RA_PtrnInc, 4

RA_1:                                           ;fill memory with pattern
        add     RA_NxtAdd, RA_StrtAdd, 0        ;get start address
        sub     RA_TmpCnt, RA_WrdCnt, 2         ;for jmpfdec
        add     RA_WrtPat, RA_StrtPat, 0        ;set the pattern

RA_2:
        store   0, 0, RA_WrtPat, RA_NxtAdd
        add     RA_WrtPat, RA_WrtPat, RA_PtrnInc
        jmpfdec RA_TmpCnt, RA_2
        add     RA_NxtAdd, RA_NxtAdd, 4         ;next test mem addr
                                                ;check memory for pattern
        add     RA_NxtAdd, RA_StrtAdd, 0        ;get start address
        sub     RA_TmpCnt, RA_WrdCnt, 2         ;for jmpfdec
        add     RA_WrtPat, RA_StrtPat, 0        ;set the pattern

RA_3:
        load    CD, DATA_CTL, RA_RedPat, RA_NxtAdd
        cpneq   RA_Fail, RA_RedPat, RA_WrtPat
        jmpt    RA_Fail, RA_ERR                 ;err if neg
        nop
        add     RA_WrtPat, RA_WrtPat, RA_PtrnInc
        jmpfdec RA_TmpCnt, RA_3
        add     RA_NxtAdd, RA_NxtAdd, 4         ;next test mem address
                                                ;invert ptrn for next pass
        nor     RA_StrtPat, RA_StrtPat, 0       ;invert initial
        cpneq   RA_Fail, RA_StrtPat, RA_StrtAdd
        jmpt    RA_Fail, RA_1
        subr    RA_PtrnInc, RA_PtrnInc, 0       ;negate inc value
        jmp     RA_EXIT
        nop

RA_ERR:
        call    lr0, RAMErr
        nop
        const   RA_Fail, TRUE                   ;set after call
        consth  RA_Fail, TRUE

RA_EXIT:
        EPILOGUE
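
The logic of the two-pass address pattern test can be followed more easily in a host-side model. The Python sketch below is an illustration under stated assumptions, not a translation of the listing: memory is modeled as a dictionary keyed by byte address, and the StuckBitRam class is a hypothetical fault model added only to show why the second (complemented) pass is needed.

```python
MASK = 0xFFFFFFFF  # 32-bit words


def ram_addr_test(mem, start, words):
    """Two-pass address-pattern test over `words` 32-bit words.

    Pass 1 writes each word's own address; pass 2 writes the complement.
    Returns None on success, or (address, written, read) for the first error.
    """
    pattern, inc = start & MASK, 4
    for _ in range(2):
        value = pattern
        for i in range(words):            # fill memory with the pattern
            mem[start + 4 * i] = value
            value = (value + inc) & MASK
        value = pattern
        for i in range(words):            # check memory for the pattern
            addr = start + 4 * i
            got = mem[addr]
            if got != value:
                return (addr, value, got)
            value = (value + inc) & MASK
        pattern = ~pattern & MASK         # invert the pattern for the next pass
        inc = -inc                        # and negate the increment
    return None


class StuckBitRam(dict):
    """Hypothetical faulty RAM: bit 0 of the word at address 8 is stuck at 0."""

    def __setitem__(self, addr, value):
        if addr == 8:
            value &= ~1 & MASK
        super().__setitem__(addr, value)


print(ram_addr_test({}, 0, 16))
print(ram_addr_test(StuckBitRam(), 0, 16))
```

In this model the stuck bit passes the first pass, because the address 8 already has bit 0 clear, and is caught only when the complemented pattern is written.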
Listing 3. Setting Vector Table Entries

        .sbttl  "Vector Initialization"
        LEAF    VectInit, 0

This routine initializes the vector table and vab. All vectors
are set to point to the invalid trap handler in ROM.

        .reg    VI_Vect, %%(TEMP_REG + 0)       ;vector value
        .reg    VI_VectSt, %%(TEMP_REG + 1)     ;vector storage address
        .reg    VI_VectCnt, %%(TEMP_REG + 2)    ;vector count register
        mtsrim  vab, 0
        mfsr    VI_VectSt, vab
        const   VI_Vect, (InvalidTrapHandler | 2)
        consth  VI_Vect, InvalidTrapHandler
        const   VI_VectCnt, (256 - 2)           ;for jmpfdec

VI_Loop:
        store   0, 0, VI_Vect, VI_VectSt        ;store the vector
        jmpfdec VI_VectCnt, VI_Loop
        add     VI_VectSt, VI_VectSt, 4
        EPILOGUE



Step 6— Transcribing Code to RAM

BOOT.S transcribes START.S and the C-language 
application (simulated by TEST.S) into instruction/data 
RAM by calling RAMInit. 

RAMInit is a routine that is created by the ROMCOFF
utility. When an executable Am29000 object file is submitted
to ROMCOFF, the utility generates a relocatable
object file of type RI_Text that (when called) establishes
an image of the executable module in instruction/data
RAM. BOOT.S transfers START.S and the C-language
application to RAM by calling the RAMInit routine created
by ROMCOFF.

RAMInit is called by:

        call    RI_Ret, RAMInit         ;initialize RAM

Note that when RAMInit is called, the return address is
not stored in a local register (such as lr0), and that
RAMInit is called just before transferring control to
_main. To transcribe data to RAM, RAMInit uses a
stream of const and consth instructions to load
the local registers starting from lr0, then a
store multiple instruction to transfer the data into memory.
Consequently, any data in local registers will be
overwritten.

Step 7— Calling START.S 

As BOOT.S does not intend to have control returned to
it, it calls START.S by simulating a return from interrupt.
This is accomplished by setting the freeze (FZ) bit ON
in the old processor status (ops) and current processor
status (cps) registers, putting the starting address of
START.S in PC1 and the address of the following
instruction in PC0, and performing a return from interrupt
(see Listing 4).

The Main Loop of BOOT.S 

When all of the preceding steps are put together, the 
main loop appears as shown in Listing 5. 



Listing 4. Calling START.S

        mtsrim  ops, 0x473      ;FZ, PD, PI, SM, DI, DA
        mtsrim  cps, 0x473      ;FZ, PD, PI, SM, DI, DA
        const   lr0, TextBas    ;(using lr0 as temp)
        consth  lr0, TextBas
        mtsr    pc1, lr0
        add     lr0, lr0, 4
        mtsr    pc0, lr0
        iretinv                 ;go to inst space, TextBas
Listing 5. Main Loop of BOOT.S

        .reg    RI_Ret, %%(TEMP_REG + 0)        ;RAMInit return
        mtsrim  cps, 0x173                      ;RE, PD, PI, SM, DI, DA
        const   rfb, 512                        ;set up temp reg frame
        const   rab, 0
        sub     rsp, rfb, 16                    ;enough for p0 and p1
        add     lr1, rfb, 0
        call    lr0, SerInit                    ;initialize an 8530 to report errors
        nop
        const   p1, (RAM_SIZE >> 2)             ;test full RAM size
        consth  p1, (RAM_SIZE >> 2)
        call    lr0, RAMAddr                    ;call a RAM address test
        const   p0, 0                           ;test from addr 0 (input parm to RAM test)
        call    lr0, VectInit                   ;routine to initialize traps to
        nop                                     ;invalid trap handler
        call    RI_Ret, RAMInit                 ;initialize RAM, from ROMCOFF
        mtsrim  ops, 0x473                      ;FZ, PD, PI, SM, DI, DA
        mtsrim  cps, 0x473                      ;FZ, PD, PI, SM, DI, DA
        const   lr0, TextBas                    ;(using lr0 as temp)
        consth  lr0, TextBas
        mtsr    pc1, lr0
        add     lr0, lr0, 4
        mtsr    pc0, lr0
        iretinv



CREATING THE EXECUTION ENVIRONMENT 
WITH START.S 

The START.S file is used to prepare the execution 
environment for the application program (simulated by 
TEST.S). Although a given application certainly will 
have varied requirements in different hardware environ- 
ments, the tasks that will be performed by START.S are 
needed to establish virtually any operating environment 
on the Am29000. These are: 

1. Configure the Am29000. 

2. Allocate the register and memory stacks. 

3. Initialize vector table and trap handlers. 

4. Initialize the TLB by marking all entries invalid. 

5. Call "main." 

Step 1— Configuring the Am29000

Code similar to that shown below can be used to set the
contents of the configuration register (cfg) so that the vector area is a table of
pointers (VF = 1) and the Branch Target Cache is
disabled (CD = 1). Also, the cps register is set so that
physical addressing is used for both instructions and
data (PD = 1, PI = 1), all interrupts and traps are disabled
(DI = 1), and supervisor mode is ON (SM = 1). The timer
(tmr) is also set to 0 to avoid unwanted timer interrupts:

        mtsrim  tmr, 0
        mtsrim  cfg, (VF|CD)
        mtsrim  cps, (PD|PI|SM|DI)



The setting of the VF bit has determined the structure of 
the vector area table. The vector area is a user- 
managed table in external instruction/data memory that 
starts at the address held in the vector area base (VAB) 
register. The vector area can have one of two different 
structures, as determined by the VF bit of the configura- 
tion register. 

If VF = 1, then the vector area is organized as a list of
256 pointers to interrupt/trap handlers. If VF = 0, then the
vector area is arranged as 256 64-instruction blocks,
each corresponding to a given call. Each fixed block
then contains the corresponding interrupt or trap
handler. Figure 8 shows the two structures.

When the Am29000 receives an interrupt or trap, the
location of the appropriate handler is determined by the
vector area (VA). Each interrupt and trap has a vector
number between 0 and 255 that corresponds to an entry
in the vector area. Of the vector numbers, 0 to 63 are
reserved for system and floating-point operations. The
assigned vector numbers are given in the Am29000
User's Manual.

If the table is a list of pointers, control will be passed to
the address stored at VAB + (vector number * 4). Multiplication
by 4 converts the vector number to a byte offset, because
each pointer occupies one 4-byte word. If the vector
table is composed of handlers, control will be passed to
a handler starting at VAB + (vector number * 64 * 4),
where the vector number is multiplied by the fixed number
of instructions per block (64) and by the 4 bytes per
instruction (see Table 2).
Table 2. The Location of a Pointer in the VAT

CFG:VF          ISR Address =

1               VAB + (vector number * 4)
0               VAB + (vector number * 256)
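
The two address calculations can be written out as a short host-side helper. This Python sketch is illustrative only; the function name vector_entry_address is hypothetical, and the arithmetic is taken directly from the table.

```python
def vector_entry_address(vab, vector, vf):
    """Return the byte address selected for a vector number.

    With vf = 1 this is the address of the 4-byte pointer to the handler.
    With vf = 0 this is the address of the 64-instruction handler block.
    """
    if not 0 <= vector <= 255:
        raise ValueError("vector number must be in 0..255")
    if vf:
        return vab + vector * 4
    return vab + vector * 64 * 4


print(hex(vector_entry_address(0x0, 64, 1)))
print(hex(vector_entry_address(0x1000, 2, 0)))
```

With VF = 1 the whole table occupies 256 * 4 = 1024 bytes; with VF = 0 it occupies 256 * 256 = 65536 bytes, which is one reason the pointer form is used in this example.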



Step 2— Allocating Register and Memory Stack 
Frames 

A full register stack frame is established by START.S, 
because it will call the application program (_main). 
Further, control could be passed back to the START.S 
return address (which then initiates a "warm start"). This 



should be done early in the main loop, as START.S will 
call some supporting assembly-language routines. The 
register stack frame can be established by the code 
shown in Listing 6. 

Arguments that overflow the register stack will have to
be placed in the memory stack (see Figure 8). The
current position in the memory stack is pointed to by the
memory stack pointer (msp).

The stack can be established by:

        const   msp, MStkTop
        consth  msp, MStkTop



Listing 6. Allocating Register and Memory Stack Frames 

        const   rfb, RStkTop            ;RStkTop is set to the
        consth  rfb, RStkTop            ;desired address in the declarations file
        const   rab, (RStkTop - 512)    ;128*4, maximum
        consth  rab, (RStkTop - 512)    ;part that can
        add     lr1, rfb, 0             ;be cached
        sub     rsp, rfb, 16            ;adjusts for lr0, lr1, argc, and argv
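
The pointer arithmetic in the listing can be summarized in a few lines. The following Python sketch is a host-side illustration; the function and parameter names are hypothetical, and the values 128 (registers that can be cached) and 4 (words in the initial frame) are taken from the listing comments.

```python
def register_stack_pointers(rstk_top, cached_registers=128, initial_frame_words=4):
    """Compute (rfb, rab, rsp) the way the listing does.

    rfb is the top of the register stack, rab lies one full register
    window (128 words) below it, and rsp leaves room for four words
    (lr0, lr1, argc, and argv).
    """
    rfb = rstk_top
    rab = rfb - cached_registers * 4
    rsp = rfb - initial_frame_words * 4
    return rfb, rab, rsp


print(register_stack_pointers(512))
```

Called with a top address of 512, the helper reproduces the constants used by BOOT.S for its limited frame (rfb = 512, rab = 0, rsp = 512 - 16).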



Figure 8. The Two Structures of the Vector Area
(CFG:VF = 0: handlers located at VAB + (Vector Number * 256); CFG:VF = 1: handler pointers located at VAB + (Vector Number * 4))
Step 3— Initializing the Vector Area and Vectors

Although the organization of the vector area is deter- 
mined by the configuration register, the table and point- 
ers still must be initialized. In the following example, the 
vector initialization code is kept compact, while permit- 
ting easy expansion of the vector set, by using a table in 



the .data section. Each entry in the table has two words. 
The first is the vector number; the second is the handler 
address (see Listing 7). 

When the vector area base (vab) is supplied to the 
routine shown in Listing 8, it initializes the handlers. 



Listing 7. Initializing the Vector Area and Vectors

        .data                                   ;switch to .data for table
VectInitTable:
        .word   V_SupInstTLB, SupInstTLBHandler
        .word   V_SupDataTLB, SupDataTLBHandler
        .word   V_MULTIPLY, MultiplyHandler
        .word   V_DIVIDE, DivideHandler
        .word   V_MULTIPLU, MultipluHandler
        .word   V_DIVIDU, DividuHandler
        .word   V_SPILL, SpillHandler
        .word   V_FILL, FillHandler
        .word   V_Timer, TimerHandler
        .equ    VINIT_CNT, ((. - VectInitTable) / 8)
        .text                                   ;switch back to .text for code



Listing 8. Initializing Vector Handlers

VectInit:
        .reg    VI_Vect, %%(TMP_REG + 0)        ;vector value
        .reg    VI_St, %%(TMP_REG + 1)          ;vector storage address
        .reg    VI_Cnt, %%(TMP_REG + 2)         ;vector count
        .reg    VI_Base, %%(TMP_REG + 3)        ;vector base
        .reg    VI_TbPt, %%(TMP_REG + 4)        ;table pointer
        mfsr    VI_Base, vab
        const   VI_Cnt, (VINIT_CNT - 2)         ;for jmpfdec
        const   VI_TbPt, VectInitTable
        consth  VI_TbPt, VectInitTable
VI_Loop:
        load    0, 0, VI_St, VI_TbPt            ;get the vector
        add     VI_TbPt, VI_TbPt, 4
        sll     VI_St, VI_St, 2                 ;convert to address
        add     VI_St, VI_St, VI_Base
        load    0, 0, VI_Vect, VI_TbPt          ;get the handler
        add     VI_TbPt, VI_TbPt, 4
        jmpfdec VI_Cnt, VI_Loop
        store   0, 0, VI_Vect, VI_St
        jmpi    raddr
        nop
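
The table-driven initialization loop can be modeled on a host as follows. This Python sketch assumes the pointer-table organization (VF = 1) and models memory as a dictionary keyed by byte address; the vector numbers and handler addresses in the usage lines are placeholders, not values from the Am29000 User's Manual.

```python
def install_vectors(memory, vab, table):
    """Store each handler address at vab + (vector number * 4).

    `table` is a sequence of (vector number, handler address) pairs,
    mirroring the two-word entries of the initialization table.
    """
    for vector, handler in table:
        slot = vab + (vector << 2)   # shift left by 2 converts to a byte offset
        memory[slot] = handler
    return memory


# Placeholder vector numbers and handler addresses, for illustration only.
example_table = [(9, 0x4000), (64, 0x4100)]
print(install_vectors({}, 0x0, example_table))
```

Keeping the vector assignments in a data table means a new handler is installed by adding one table entry, without changing the loop.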
Step 4— Initializing the Translation Look-Aside
Buffer (TLB)

When the Am29000 is first powered up, the TLB will not
have valid entries. To prevent erroneous TLB misses,
the entries should be marked invalid by the start-up
sequence before control is passed to the application
program. This can be done with an assembly-language
sequence (see Listing 9).

Step 5— Calling "main" 

Once the proper environment has been established for
the application program, the main C program must be
called. This is done by placing the address of the starting
instruction in registers and performing a call. When the
jump is "short," or less than 256 words, a call can be
done directly. However, the jump often will be farther,
and calli must be used in conjunction with an address
stored in registers, as shown below:

        const   raddr, _main    ;store lower 16 bits
        consth  raddr, _main    ;store upper 16 bits
        calli   raddr, raddr    ;call indirect

Notice that raddr signifies the return address, usually
lr0, by convention. Once the call is made, the return
address of the caller has replaced the target location in raddr, in
the event there is a return from _main.
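
The const/consth pair used here (and throughout the listings) builds a 32-bit value from two 16-bit halves. The Python sketch below is a host-side model of that idiom under the assumption that const sets the low half and clears the upper half, and that consth replaces only the upper half; the function names are hypothetical.

```python
def split_const(value):
    """Return the (low, high) 16-bit halves loaded by const and consth."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


def apply_const_consth(low, high):
    """Model the register contents after const followed by consth."""
    reg = low & 0xFFFF                               # const: low half, upper half cleared
    reg = (reg & 0xFFFF) | ((high & 0xFFFF) << 16)   # consth: upper half only
    return reg


low, high = split_const(0xFFFFFFF0)
print(hex(low), hex(high), hex(apply_const_consth(low, high)))
```

The order matters in this model: if const were executed after consth, the upper half would be cleared again.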

The START.S Main Loop 

The complete START.S main loop, as developed in the 
previous sections, is shown in Listing 10. The routine 
receives control after being transcribed to RAM; once 
there, it initializes the vector handlers, clears the BSS 
area, initializes the TLBs, and establishes initial stack 
pointers and an initial register frame. Lastly, it invokes 
_main. Note that, in the event _main returns, a warm 
start is performed. 





Listing 9. Initializing the TLB

        .reg    TI_Reg, %%(TEMP_REG + 0)        ;the TLB register number
        .reg    TI_Val, %%(TEMP_REG + 1)        ;the TLB value (0)
        .reg    TI_Cnt, %%(TEMP_REG + 2)        ;the TLB register count
        const   TI_Reg, 0
        const   TI_Val, 0
        const   TI_Cnt, (TLB_CNT - 2)           ;for jmpfdec
TI_Loop:
        mttlb   TI_Reg, TI_Val
        jmpfdec TI_Cnt, TI_Loop
        add     TI_Reg, TI_Reg, 1
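
Several listings load a loop counter with (count - 2) and mark it "for jmpfdec". The reason can be seen in a small host-side model. This Python sketch assumes that jmpfdec tests the counter first, jumps when the counter is non-negative, and decrements it in either case; the function name is hypothetical.

```python
def loop_body_executions(initial_count):
    """Count how many times a loop body ending in jmpfdec executes."""
    count = initial_count
    executed = 0
    while True:
        executed += 1              # loop body (including the delay slot)
        jump_taken = count >= 0    # jump when the sign bit is clear
        count -= 1                 # decrement happens whether or not we jump
        if not jump_taken:
            return executed


print(loop_body_executions(256 - 2))
```

Under these assumptions a counter initialized to N - 2 gives exactly N passes through the loop: one pass on entry, N - 1 taken jumps while the counter runs from N - 2 down to 0, and a final fall-through when the counter reads -1.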
Listing 10. START.S Main Loop

        mtsrim  cps, 0x73               ;set PD, PI, SM, DI, DA
        mtsrim  mmu, MMU_PS             ;PID = 0
        mtsrim  cfg, 0x10               ;VF
        const   rfb, RStkTop            ;set up stack pointers
        consth  rfb, RStkTop
        const   rab, (RStkTop - 512)
        consth  rab, (RStkTop - 512)
        add     lr1, rfb, 0
        sub     rsp, rfb, 16            ;make room for lr0, lr1, argc, argv
        const   msp, MStkTop
        consth  msp, MStkTop
        call    lr0, VectInit           ;routine to install handler vectors
        nop
        call    lr0, TLBInit            ;routine to mark TLBs invalid
        nop
        mtsrim  cps, 0x10               ;SM
        const   lr2, 0                  ;argc = 0
        const   lr3, 0                  ;argv = 0
        call    lr0, _main
        nop
        mtsrim  cps, 0x473              ;set FZ, PD, PI, SM, DI, DA
        mtsrim  ops, 0x173              ;set RE, PD, PI, SM, DI, DA
        mtsrim  cfg, 1                  ;cache disabled
        mtsrim  chc, 0                  ;contents invalid
        mtsrim  pc1, 0                  ;cold start address
        mtsrim  pc0, 4
        iretinv
APPENDIX A: BOOT.S

.title "ROM Boot Code" 

Copyright 1988, Advanced Micro Devices 
Written by Gibbons and Associates, Inc. 

This module is intended to receive control at address 0. It handles a hardware 
reset or a simulation of that event in a "warm start" situation. 

Its purpose is to provide sufficient initializations for the operation of a program 
in RAM data/instruction space. The initializations must include the transcription 
of the program and its initialized data. The code and initialized data are stored 
in ROM prior to transcription. 

To provide for orderly operation, C linkages are used. It is known that the register
stack will never overflow. When certain calamities occur (e.g., invalid
traps), the registers will be re-initialized to allow the use of subroutines in
this module. There is no intention of ever returning under these circumstances.

Some of the routines in this module have a rather tedious implementation because 
they do not assume the validity of RAM or the readability of ROM. This is 
considered appropriate since it assures the validity of error handling. 

This module provides no global addresses for external use. It is not intended to 
be called. It is best thought of as bootstrap code. 

Some tests which are not actually used are included here for use in environments 
that may allow them. 

The external addresses named below are required. 

        .extern RAMInit                         ;romcoff generated

This module needs the addresses for the control and data ports of the SCC. These
are declared below.

        .equ    SCCCntlAd, 0xfffffff0           ;control port address
        .equ    SCCDataAd, 0xfffffff4           ;data port address

This module assumes that RAM begins at data address 0 and has the size declared
below.

        .equ    RAM_SIZE, 0x40000               ;256K bytes
        .include "romdcl.h"
        .eject
        .sbttl  "Section Declarations"

This module has only one section, which is called "rom." It receives control at
reset, i.e., it is an absolute segment based at address 0 (in ROM space).

        .sect   rom, text, absolute
        .use    rom
RomBase:
        jmp     Boot                            ;the RESET entry
        nop
        nop
        nop
        halt                                    ;the warn entry
        nop                                     ;Could be a report ro
        .eject
        .sbttl  "SCC Routines"
        LEAF    SerInit, 0

This routine initializes the serial port for non-interrupt driven access at 9600
baud.

In:     (nothing)
Out:    (nothing)

        .reg    SI_CtAd, %%(TEMP_REG + 0)       ;control port address
        .reg    SI_CtVl, %%(TEMP_REG + 1)       ;control port value
        const   SI_CtAd, SCCCntlAd
        consth  SI_CtAd, SCCCntlAd
        const   SI_CtVl, 9                      ;reset the port
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 4                      ;x16, 1 stop, no parity
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x44
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 3                      ;8 bits receive
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 5                      ;8 bits xmit
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x60
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 9                      ;Int. disabled
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 10                     ;NRZ
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 11                     ;Tx & Rx BRG out
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x56
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 12                     ;9600 baud
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x6
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 13                     ;9600 baud
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 14                     ;BRG in RTxC
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 14                     ;BRG on
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x1
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 3                      ;Rx enable
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc1
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 5                      ;Tx enable
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xea
        store   0, 0, SI_CtVl, SI_CtAd
        EPILOGUE

        LEAF    SerXmt, 1



This routine transmits a single character via the SCC. It will wait (forever) for
the SCC to become ready.

In:     (see below)
Out:    (nothing)

        .reg    SX_Char, %%(IN_PRM + 0)         ;character
        .reg    SX_Ad, %%(TEMP_REG + 0)         ;port address
        .reg    SX_Vl, %%(TEMP_REG + 1)         ;port value
        const   SX_Ad, SCCCntlAd
        consth  SX_Ad, SCCCntlAd
SX_Wait:
        load    0, 0, SX_Vl, SX_Ad              ;get the status
        and     SX_Vl, SX_Vl, 0x4               ;check tx buf empty
        cpeq    SX_Vl, SX_Vl, 0
        jmpf    SX_Vl, SX_Wait
        nop
        const   SX_Ad, SCCDataAd
        consth  SX_Ad, SCCDataAd
        store   0, 0, SX_Char, SX_Ad            ;send the character
        EPILOGUE

        LEAF    SerRcv, 0



This routine waits for a receive character to become ready, then reads and returns
that character.

In:     (nothing)
Out:    (see below)

        .reg    SR_Ad, %%(TEMP_REG + 0)         ;port address
        .reg    SR_Char, %%(RET_VAL + 0)        ;character (stat tmp)
        const   SR_Ad, SCCCntlAd
        consth  SR_Ad, SCCCntlAd
SR_Wait:
        load    0, 0, SR_Char, SR_Ad            ;get the status
        and     SR_Char, SR_Char, 0x1           ;check rcv buf ready
        cpeq    SR_Char, SR_Char, 0
        jmpf    SR_Char, SR_Wait
        nop
        const   SR_Ad, SCCDataAd
        consth  SR_Ad, SCCDataAd
        load    0, 0, SR_Char, SR_Ad            ;fetch the character
        and     SR_Char, SR_Char, 0xff
        EPILOGUE

        LEAF    SerChk, 0



This routine checks to determine if a receive character is ready at the serial
port. It will return -1 if a character is ready and 0 if it is not.

In:     (nothing)
Out:    (see below)

        .reg    SC_Ad, %%(TEMP_REG + 0)         ;port address
        .reg    SC_Rdy, %%(RET_VAL + 0)         ;character
        const   SC_Ad, SCCCntlAd
        consth  SC_Ad, SCCCntlAd
        load    0, 0, SC_Rdy, SC_Ad             ;get the status
        and     SC_Rdy, SC_Rdy, 0x1             ;check rcv buf ready
        cpeq    SC_Rdy, SC_Rdy, 0
        sra     SC_Rdy, SC_Rdy, 31              ;convert to 0 or -1
        EPILOGUE

        .eject
        .sbttl  "Error Message Routines"

        FUNCTION SendErr, 0, 0, 1

This routine sends the text "Error - ".

        .reg    SE_Char, %%(OUT_PRM + 0)        ;output character
        call    lr0, SerXmt
        const   SE_Char, 'E'                    ;send a "E"
        call    lr0, SerXmt
        const   SE_Char, 'r'                    ;send a "r"
        call    lr0, SerXmt
        const   SE_Char, 'r'                    ;send a "r"
        call    lr0, SerXmt
        const   SE_Char, 'o'                    ;send a "o"
        call    lr0, SerXmt
        const   SE_Char, 'r'                    ;send a "r"
        call    lr0, SerXmt
        const   SE_Char, ' '                    ;send a " "
        call    lr0, SerXmt
        const   SE_Char, '-'                    ;send a "-"
        call    lr0, SerXmt
        const   SE_Char, ' '                    ;send a " "
        EPILOGUE

        FUNCTION SendNL, 0, 0, 1
This routine sends a CR-LF sequence.

        .reg    SN_Char, %%(OUT_PRM + 0)
        call    lr0, SerXmt
        const   SN_Char, 0x0d                   ;send a "CR"
        call    lr0, SerXmt
        const   SN_Char, 0x0a                   ;send a "LF"
        EPILOGUE

        FUNCTION SendWord, 1, 1, 1

This routine sends a 32-bit word in ASCII hex.

        .reg    SW_Word, %%(IN_PRM + 0)         ;the word to send
        .reg    SW_Shift, %%(LOC_REG + 0)       ;shift factor
        .reg    SW_T_Flag, %%(TEMP_REG + 0)
        .reg    SW_Char, %%(OUT_PRM + 0)        ;character to send
        const   SW_Shift, 28                    ;right shift factor
SW_0:
        srl     SW_Char, SW_Word, SW_Shift      ;isolate nibble
        and     SW_Char, SW_Char, 0xf
        cplt    SW_T_Flag, SW_Char, 10          ;check decimal
        jmpt    SW_T_Flag, SW_1
        add     SW_Char, SW_Char, 0x30          ;convert to ASCII digit
        add     SW_Char, SW_Char, 0x27          ;convert to ASCII letter
SW_1:
        call    lr0, SerXmt                     ;send the character
        nop
        subs    SW_Shift, SW_Shift, 4           ;next digit shift fact
        cpge    SW_T_Flag, SW_Shift, 0          ;check if done
        jmpt    SW_T_Flag, SW_0                 ;continue if not
        nop
        EPILOGUE
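
The nibble-by-nibble conversion performed by SendWord can be followed in a host-side model. This Python sketch is illustrative only; the transmit argument stands in for the call to SerXmt, and the function name send_word is hypothetical.

```python
def send_word(word, transmit):
    """Send a 32-bit word as eight lowercase ASCII hex characters."""
    shift = 28                             # start with the most significant nibble
    while shift >= 0:
        nibble = (word >> shift) & 0xF     # isolate nibble
        char = nibble + 0x30               # convert to ASCII digit ('0'..'9')
        if nibble >= 10:
            char += 0x27                   # convert to ASCII letter ('a'..'f')
        transmit(chr(char))
        shift -= 4                         # next digit shift factor


sent = []
send_word(0xDEADBEEF, sent.append)
print("".join(sent))
```

Adding 0x30 unconditionally and then 0x27 only for values of 10 or more mirrors the way the assembly routine uses the delay slot of its conditional jump.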



        FUNCTION RAMErr, 3, 0, 1

This routine reports RAM errors with the message,
"Error - RAM at aaaaaaaa write bbbbbbbb read cccccccc\n"

        .reg    RE_ErrAdd, %%(IN_PRM + 0)
        .reg    RE_WrtPat, %%(IN_PRM + 1)
        .reg    RE_RedPat, %%(IN_PRM + 2)
        .reg    RE_Char, %%(OUT_PRM + 0)
        .reg    RE_Word, %%(OUT_PRM + 0)
        call    lr0, SendErr                    ;send "Error - "
        nop
        call    lr0, SerXmt
        const   RE_Char, 'R'                    ;send a "R"
        call    lr0, SerXmt
        const   RE_Char, 'A'                    ;send a "A"
        call    lr0, SerXmt
        const   RE_Char, 'M'                    ;send a "M"
        call    lr0, SerXmt
        const   RE_Char, ' '                    ;send a " "
        call    lr0, SerXmt
        const   RE_Char, 'A'                    ;send a "A"
        call    lr0, SerXmt
        const   RE_Char, 'T'                    ;send a "T"
        call    lr0, SerXmt
        const   RE_Char, ' '                    ;send a " "
        call    lr0, SendWord
        add     RE_Word, RE_ErrAdd, 0           ;send error address
        call    lr0, SerXmt
        const   RE_Char, ' '                    ;send a " "
        call    lr0, SerXmt
        const   RE_Char, 'w'                    ;send a "w"
        call    lr0, SerXmt
        const   RE_Char, 'r'                    ;send a "r"
        call    lr0, SerXmt
        const   RE_Char, 'i'                    ;send a "i"
        call    lr0, SerXmt
        const   RE_Char, 't'                    ;send a "t"
        call    lr0, SerXmt
        const   RE_Char, 'e'                    ;send a "e"
        call    lr0, SerXmt
        const   RE_Char, ' '                    ;send a " "
        call    lr0, SendWord
        add     RE_Word, RE_WrtPat, 0           ;send good pattern
        call    lr0, SerXmt
        const   RE_Char, ' '                    ;send a " "
        call    lr0, SerXmt
        const   RE_Char, 'R'                    ;send a "R"
        call    lr0, SerXmt
        const   RE_Char, 'e'                    ;send a "e"
        call    lr0, SerXmt
        const   RE_Char, 'a'                    ;send a "a"
        call    lr0, SerXmt
        const   RE_Char, 'd'                    ;send a "d"
        call    lr0, SerXmt
        const   RE_Char, ' '                    ;send a " "
        call    lr0, SendWord
        add     RE_Word, RE_RedPat, 0           ;send bad pattern
        call    lr0, SendNL                     ;send a new line
        nop
        EPILOGUE



        FUNCTION ROMErr, 1, 0, 1

This routine reports a ROM sum error with the message,
"Error - ROM sum aaaaaaaa\n"

        .reg    ROM_Sum, %%(IN_PRM + 0)
        .reg    ROM_Char, %%(OUT_PRM + 0)
        .reg    ROM_Word, %%(OUT_PRM + 0)
        call    lr0, SendErr                    ;send "Error - "
        nop
        call    lr0, SerXmt
        const   ROM_Char, 'R'                   ;send a "R"
        call    lr0, SerXmt
        const   ROM_Char, 'O'                   ;send a "O"
        call    lr0, SerXmt
        const   ROM_Char, 'M'                   ;send a "M"
        call    lr0, SerXmt
        const   ROM_Char, ' '                   ;send a " "
        call    lr0, SerXmt
        const   ROM_Char, 's'                   ;send a "s"
        call    lr0, SerXmt
        const   ROM_Char, 'u'                   ;send a "u"
        call    lr0, SerXmt
        const   ROM_Char, 'm'                   ;send a "m"
        call    lr0, SerXmt
        const   ROM_Char, ' '                   ;send a " "
        call    lr0, SerXmt
        const   ROM_Char, '='                   ;send a "="
        call    lr0, SerXmt
        const   ROM_Char, ' '                   ;send a " "
        call    lr0, SendWord
        add     ROM_Word, ROM_Sum, 0            ;send ROM check sum
        call    lr0, SendNL                     ;send a new line
        nop
        EPILOGUE



        FUNCTION SizeErr, 0, 0, 1

This routine reports insufficient RAM size with the message
"Error - RAM size\n"

        .reg    SIZ_Char, %%(OUT_PRM + 0)
        call    lr0, SendErr                    ;send "Error - "
        nop
        call    lr0, SerXmt
        const   SIZ_Char, 'R'                   ;send a "R"
        call    lr0, SerXmt
        const   SIZ_Char, 'A'                   ;send a "A"
        call    lr0, SerXmt
        const   SIZ_Char, 'M'                   ;send a "M"
        call    lr0, SerXmt
        const   SIZ_Char, ' '                   ;send a " "
        call    lr0, SerXmt
        const   SIZ_Char, 's'                   ;send a "s"
        call    lr0, SerXmt
        const   SIZ_Char, 'i'                   ;send a "i"
        call    lr0, SerXmt
        const   SIZ_Char, 'z'                   ;send a "z"
        call    lr0, SerXmt
        const   SIZ_Char, 'e'                   ;send a "e"
        call    lr0, SendNL                     ;send a new line
        nop
        EPILOGUE



        FUNCTION TrapErr, 0, 0, 1

This routine reports an invalid trap with the message
"Error - Invalid trap\n"

        .reg    TE_Char, %%(OUT_PRM + 0)
        call    lr0, SendErr                    ;send "Error - "
        nop
        call    lr0, SerXmt
        const   TE_Char, 'I'                    ;send a "I"
        call    lr0, SerXmt
        const   TE_Char, 'n'                    ;send a "n"
        call    lr0, SerXmt
        const   TE_Char, 'v'                    ;send a "v"
        call    lr0, SerXmt
        const   TE_Char, 'a'                    ;send a "a"
        call    lr0, SerXmt
        const   TE_Char, 'l'                    ;send a "l"
        call    lr0, SerXmt
        const   TE_Char, 'i'                    ;send a "i"
        call    lr0, SerXmt
        const   TE_Char, 'd'                    ;send a "d"
        call    lr0, SerXmt
        const   TE_Char, ' '                    ;send a " "
        call    lr0, SerXmt
        const   TE_Char, 't'                    ;send a "t"
        call    lr0, SerXmt
        const   TE_Char, 'r'                    ;send a "r"
        call    lr0, SerXmt
        const   TE_Char, 'a'                    ;send a "a"
        call    lr0, SerXmt
        const   TE_Char, 'p'                    ;send a "p"
        call    lr0, SendNL                     ;send a new line
        nop
        EPILOGUE

        .eject
        .sbttl  "ROM Checksum Test"



        FUNCTION ROMSum,2,0,1

This routine is used to ensure that the ROM is intact by using
the checksum checking method.

In:     (see below)
Out:    (see below)

        .reg    RS_StrtAdd,%%(IN_PRM + 0)       ;start address
        .reg    RS_WrdCnt,%%(IN_PRM + 1)        ;word count
        .reg    RS_SumTmp,%%(TEMP_REG + 0)
        .reg    RS_ChkSum,%%(OUT_PRM + 0)
        .reg    RS_Fail,%%(RET_VAL + 0)         ;TRUE for fail
        xor     RS_ChkSum,RS_ChkSum,RS_ChkSum   ;clear ChkSum
        sub     RS_WrdCnt,RS_WrdCnt,2           ;for jmpfdec
RS_1:
        load    CD,ROM_CTL,RS_SumTmp,RS_StrtAdd
        add     RS_ChkSum,RS_ChkSum,RS_SumTmp   ;add to ChkSum
        jmpfdec RS_WrdCnt,RS_1
        add     RS_StrtAdd,RS_StrtAdd,4         ;next ROM addr
        cpneq   RS_Fail,RS_ChkSum,0             ;if ChkSum = 0 then
        jmpf    RS_Fail,RS_EXIT                 ;RS_PASS else RS_ERR
        nop
        call    lr0,ROMErr                      ;call ROMErr routine
        nop                                     ;O/P para - ChkSum
        const   RS_Fail,TRUE                    ;TRUE for test fail
        consth  RS_Fail,TRUE
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RS_EXIT: 


EPILOGUE 
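
The checksum loop of ROMSum can be restated in C. This is a minimal sketch, assuming a good ROM image is one whose 32-bit words sum to zero modulo 2^32. The names rom_sum, rom_check_fails and rom_fixup_word are hypothetical and do not appear in boot.s.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum `count` 32-bit words; the addition wraps modulo 2^32,
 * as the 32-bit add in the assembly loop does. */
uint32_t rom_sum(const uint32_t *rom, size_t count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += rom[i];
    return sum;
}

/* Nonzero (fail) when the words do not sum to zero; plays the
 * role of RS_Fail in the listing. */
int rom_check_fails(const uint32_t *rom, size_t count)
{
    return rom_sum(rom, count) != 0;
}

/* Hypothetical build-time helper: the value to place in one spare
 * word so that the whole image sums to zero. */
uint32_t rom_fixup_word(const uint32_t *rom, size_t count)
{
    return (uint32_t)0 - rom_sum(rom, count);
}
```

A build step would compute rom_fixup_word over every word except the spare one and store the result there; the boot-time check then only needs the sum and a compare against zero.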






        .eject
        .sbttl  "RAM 01 Test"

        FUNCTION RAM01,2,0,3

This routine tests the RAM by the following method: set all RAM area to 0, then check
for 0; set all RAM area to 1, then check for 1.

In:     (see below)
Out:    (see below)

        .reg    R01_StrtAdd,%%(IN_PRM + 0)      ;starting address
        .reg    R01_WrdCnt,%%(IN_PRM + 1)       ;count of words
        .reg    R01_TmpCnt,%%(TEMP_REG + 0)     ;counter
        .reg    R01_NxtAdd,%%(OUT_PRM + 0)      ;error address
        .reg    R01_WrtPat,%%(OUT_PRM + 1)      ;pattern written
        .reg    R01_RedPat,%%(OUT_PRM + 2)      ;pattern read
        .reg    R01_Fail,%%(RET_VAL + 0)        ;TRUE for fail
        xor     R01_WrtPat,R01_WrtPat,R01_WrtPat ;0 to start
R01_0:
                                                ;set 0's or 1's
        add     R01_NxtAdd,R01_StrtAdd,0        ;get strt RAM addr
        sub     R01_TmpCnt,R01_WrdCnt,2         ;for jmpfdec
R01_1:
        store   CD,DATA_CTL,R01_WrtPat,R01_NxtAdd
        jmpfdec R01_TmpCnt,R01_1
        add     R01_NxtAdd,R01_NxtAdd,WRD_SIZ
                                                ;check for 0's or 1's
        add     R01_NxtAdd,R01_StrtAdd,0        ;get strt RAM addr
        sub     R01_TmpCnt,R01_WrdCnt,2         ;for jmpfdec
R01_2:
        load    CD,DATA_CTL,R01_RedPat,R01_NxtAdd
        cpneq   R01_Fail,R01_RedPat,R01_WrtPat  ;err if neq
        jmpt    R01_Fail,R01_ERR
        nop
        jmpfdec R01_TmpCnt,R01_2
        add     R01_NxtAdd,R01_NxtAdd,WRD_SIZ
        cpeq    R01_Fail,R01_WrtPat,0           ;if WrtPat = 0 then
        jmpt    R01_Fail,R01_0                  ;R01_0 else done
        nor     R01_WrtPat,R01_WrtPat,R01_WrtPat ;invert ptrn
        jmp     R01_EXIT                        ;pass 0 and 1 test
        nop
R01_ERR:
        call    lr0,RAMErr                      ;O/P Parms - NxtAdd, WrtPat, RedPat
        nop
        const   R01_Fail,TRUE                   ;TRUE for test fail
        consth  R01_Fail,TRUE
R01_EXIT:
        EPILOGUE
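
The two passes of the RAM 01 test (all zeros, then all ones) can be sketched in C as follows. The function name ram01_test and its bad_index out-parameter are hypothetical; the sketch reports only the failing word index, whereas the assembly routine passes the address, the written pattern and the read pattern to RAMErr.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill `words` words with a pattern, read them back, then repeat with
 * the inverted pattern. Returns 0 on pass, 1 on the first mismatch. */
int ram01_test(volatile uint32_t *base, size_t words, size_t *bad_index)
{
    uint32_t pattern = 0;                 /* pass 0: all zeros */

    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < words; i++)
            base[i] = pattern;            /* write phase */

        for (size_t i = 0; i < words; i++) {
            if (base[i] != pattern) {     /* verify phase */
                *bad_index = i;
                return 1;
            }
        }
        pattern = ~pattern;               /* pass 1: all ones */
    }
    return 0;
}
```

The volatile qualifier keeps a compiler from folding the read-back into the preceding store, which would defeat the purpose of the test.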



        .eject
        .sbttl  "RAM Checker Pattern Test"
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FUNCTION RAMChkr,2,0,3 

This routine will run a two-pass checkerboard on RAM. It will be controlled by 
input values specifying the base address and the count of locations to be tested. 

In:     (see below)
Out:    (see below)

        .reg    RC_StrtAdd,%%(IN_PRM + 0)       ;starting address
        .reg    RC_WrdCnt,%%(IN_PRM + 1)        ;count of words
        .reg    RC_TmpCnt,%%(TEMP_REG + 0)      ;total test word count
        .reg    RC_StrtPat,%%(TEMP_REG + 1)     ;starting pattern
        .reg    RC_NxtAdd,%%(OUT_PRM + 0)       ;error address
        .reg    RC_WrtPat,%%(OUT_PRM + 1)       ;pattern written
        .reg    RC_RedPat,%%(OUT_PRM + 2)       ;pattern read
        .reg    RC_Fail,%%(RET_VAL + 0)         ;TRUE for fail
        const   RC_StrtPat,CHKPAT_a5            ;start with a5
        consth  RC_StrtPat,CHKPAT_a5
RC_1:
                                                ;fill memory with pattern
        add     RC_NxtAdd,RC_StrtAdd,0          ;get start address
        sub     RC_TmpCnt,RC_WrdCnt,2           ;for jmpfdec
        add     RC_WrtPat,RC_StrtPat,0          ;set the pattern
RC_2:
        store   0,0,RC_WrtPat,RC_NxtAdd
        R_LEFT  RC_WrtPat                       ;rotate ptrn left
        jmpfdec RC_TmpCnt,RC_2
        add     RC_NxtAdd,RC_NxtAdd,4           ;next test mem addr
                                                ;check memory for pattern
        add     RC_NxtAdd,RC_StrtAdd,0          ;get start address
        sub     RC_TmpCnt,RC_WrdCnt,2           ;for jmpfdec
        add     RC_WrtPat,RC_StrtPat,0          ;set the pattern



load 

cpneq 

jmpt 

nop 

R_LEFT 

jmpfdec 

add 



CD, DATA_CTL, RC_RedPat , RC_NxtAdd 
RC_Fail , RC_RedPat , RC_WrtPat 
RC_Fail,RC_ERR 

RC_WrtPat 
RC_TmpCnt , RC_3 
RC_NxtAdd, RC_NxtAdd, 4 

RC StrtPat,RC StrtPat,0 





jiiipi. 






nop 






jmp 


RC_1 




nop 




RC ERR: 








call 


lrO,RAMErr 




nop 






const 


RC Fail, TRUE 




consth 


RC_Fail,TRUE 


RC_EXIT: 


EPILOGUE 





rerr if neq 



; rotate ptrn left 

rnext test mem addr 

: invert ptrn for next pass 

r invert initial 

rdone if msb = 1 

rtry with inverted 



set after call 
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.eject 
        .sbttl  "RAM Address Pattern Test"

        FUNCTION RAMAddr,2,0,3

This routine will run a two-pass test on RAM. It will be controlled by input values
specifying the base address and the count of locations to be tested. In the first
pass, the data will be set equal to the address. In the second pass, the data will
be set equal to the complement of the address.

In:     (see below)
Out:    (see below)

        .reg    RA_StrtAdd,%%(IN_PRM + 0)       ;starting address
        .reg    RA_WrdCnt,%%(IN_PRM + 1)        ;count of words
        .reg    RA_TmpCnt,%%(TEMP_REG + 0)      ;total test word count
        .reg    RA_StrtPat,%%(TEMP_REG + 1)     ;starting pattern
        .reg    RA_PtrnInc,%%(TEMP_REG + 2)     ;ptrn increment value
        .reg    RA_NxtAdd,%%(OUT_PRM + 0)       ;error address
        .reg    RA_WrtPat,%%(OUT_PRM + 1)       ;pattern written
        .reg    RA_RedPat,%%(OUT_PRM + 2)       ;pattern read
        .reg    RA_Fail,%%(RET_VAL + 0)         ;TRUE for fail
        add     RA_StrtPat,RA_StrtAdd,0         ;start with address
        const   RA_PtrnInc,4
RA_1:
                                                ;fill memory with pattern
        add     RA_NxtAdd,RA_StrtAdd,0          ;get start address
        sub     RA_TmpCnt,RA_WrdCnt,2           ;for jmpfdec
        add     RA_WrtPat,RA_StrtPat,0          ;set the pattern
RA_2:
        store   0,0,RA_WrtPat,RA_NxtAdd
        add     RA_WrtPat,RA_WrtPat,RA_PtrnInc
        jmpfdec RA_TmpCnt,RA_2
        add     RA_NxtAdd,RA_NxtAdd,4           ;next test mem addr
                                                ;check memory for pattern
        add     RA_NxtAdd,RA_StrtAdd,0          ;get start address
        sub     RA_TmpCnt,RA_WrdCnt,2           ;for jmpfdec
        add     RA_WrtPat,RA_StrtPat,0          ;set the pattern
RA_3:
        load    CD,DATA_CTL,RA_RedPat,RA_NxtAdd
        cpneq   RA_Fail,RA_RedPat,RA_WrtPat     ;err if neq
        jmpt    RA_Fail,RA_ERR
        nop
        add     RA_WrtPat,RA_WrtPat,RA_PtrnInc
        jmpfdec RA_TmpCnt,RA_3
        add     RA_NxtAdd,RA_NxtAdd,4           ;next test mem addr
                                                ;invert ptrn for next pass
        nor     RA_StrtPat,RA_StrtPat,0         ;invert initial
        cpneq   RA_Fail,RA_StrtPat,RA_StrtAdd
        jmpt    RA_Fail,RA_1
        subr    RA_PtrnInc,RA_PtrnInc,0         ;negate inc value
        jmp     RA_EXIT
        nop
RA_ERR:
        call    lr0,RAMErr
        nop
        const   RA_Fail,TRUE                    ;set after call
        consth  RA_Fail,TRUE
RA_EXIT:
        EPILOGUE
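
The address-pattern test can be sketched in C as follows: the first pass writes each word's own address, the second pass writes the complement of that address. The name ram_addr_test is hypothetical, and the sketch truncates host pointers to 32 bits so that it also runs on a 64-bit development machine; on the Am29000 the address is already a 32-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Two-pass address test: pass 0 stores the address of each word,
 * pass 1 stores the complement. Returns 0 on pass, 1 on mismatch. */
int ram_addr_test(volatile uint32_t *base, size_t words)
{
    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < words; i++) {
            uint32_t addr = (uint32_t)(uintptr_t)&base[i];
            base[i] = pass ? ~addr : addr;        /* fill phase */
        }
        for (size_t i = 0; i < words; i++) {
            uint32_t addr = (uint32_t)(uintptr_t)&base[i];
            if (base[i] != (pass ? ~addr : addr)) /* check phase */
                return 1;
        }
    }
    return 0;
}
```

Because every location holds a different value, this test catches shorted or open address lines that a uniform pattern cannot reveal.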



        .eject
        .sbttl  "Invalid Trap Handler"

InvalidTrapHandler:

This routine receives control when an invalid trap occurs. It will reinitialize
a register frame for use in error reporting. It then reports the fact that an
invalid trap has occurred. Reporting of specific trap numbers could be achieved,
but at considerable cost in size. The use of an instrument such as the ADAPT29K™
is recommended for invalid trap identification. If that is not practical, this
handler (or some other) could be extended to report numbers. It would require 2K
bytes of additional code (jmp/const for each of 256 vectors).

        mtsrim  cps,0x173               ;RE,PD,PI,SM,DI,DA
        const   rfb,512                 ;set up temp reg frame
        const   rab,0
        sub     rsp,rfb,8               ;room for linkage
        call    lr0,SerInit             ;ready to report errors
        add     lr1,rfb,0               ;small frame required
        call    lr0,TrapErr             ;show trap error
        nop
        halt
        nop




        .eject
        .sbttl  "Vector Initialization"

        LEAF    VectInit,0



This routine initializes the vector table and vab. All vectors 
are set to point to the invalid trap handler in ROM. 



.reg VI_Vect, %% (TEMP_REG + 0) 

.reg VI_VectSt , %% (TEMP_REG + 1) 

.reg VI_VectCnt, %% (TEMP_REG + 2) 

mtsrim vab,0 

mfsr VI_VectSt, vab 

const VI_Vect, (InvalidTrapHandler 

consth VI_Vect, InvalidTrapHandler 

const VI VectCnt, (256 - 2) 



2) 



r vector value 

; vector storage address 

r vector count register 



•for jmpfdec 



VI_Loop: 



store 
jmpfdec 
add 
EPILOGUE 



0,0, VI_VectSt , VI_Vect 
VI_VectCnt, VI_Loop 
VI VectSt,VI VectSt,4 



• store the vector 
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•eject 

        .sbttl  "Boot"

Boot:

This routine receives control upon a hardware reset. Its purpose
is to establish the execution environment for the main program. This involves
transcriptions of data and possibly code. The transcriptions may
take the form of executing code since the ROM may not be readable.

        .reg    RI_Ret,%%(TEMP_REG + 0)         ;RAMInit return
        mtsrim  cps,0x173                       ;RE,PD,PI,SM,DI,DA
        const   rfb,512                         ;set up temp reg frame
        const   rab,0
        sub     rsp,rfb,16                      ;enough for p0 and p1
        add     lr1,rfb,0
        call    lr0,SerInit                     ;ready to report errors
        nop
        const   p1,(RAM_SIZE >> 2)              ;test full RAM size
        consth  p1,(RAM_SIZE >> 2)
        call    lr0,RAMAddr                     ;just use one test
        const   p0,0                            ;test from address zero
        call    lr0,VectInit                    ;invalid traps
        nop
        call    RI_Ret,RAMInit                  ;initialize RAM
        mtsrim  ops,0x473                       ;FZ,PD,PI,SM,DI,DA
        mtsrim  cps,0x473                       ;FZ,PD,PI,SM,DI,DA
        const   lr0,TextBas                     ;(using lr0 as temp)
        consth  lr0,TextBas
        mtsr    pc1,lr0
        add     lr0,lr0,4
        mtsr    pc0,lr0
        iretinv                                 ;go to inst space, TextBas

; end of boot.s
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APPENDIX B: start.s 

        .title  "Start and Other Assembly-language Routines"

Copyright 1988, Advanced Micro Devices, Inc. 

Written by Gibbons and Associates, Inc. 

HISTORY: 

1.3 29 July 88 E M Greenawalt SPR 0001 

Fixed shift count on line 1034 

This module provides initializations and trap handling for a program written in C
and operating in a stand-alone environment. It is designed for compatibility with
the ADAPT29K and various Am29000 monitors.

In this module, the first 16 system registers (gr64-gr79) are available for use as
system statics. They are not used in any of the routines in this file. Their
values are not saved and restored by the C interrupt handler interface, so they
are truly static.

The second 16 system registers (gr80-gr95) are used as temporary registers by trap
handlers, etc., in this module. No such trap handler is itself interruptible. No
presumption is made about the preservation of values in these registers by any
program.

.extern _main ;the C main routine 

.global V_SPILL ; the spill/fill vectors 

.global V_FILL 

NOTE: The equates below define the padding in the vector 
section (to a full page) , and constants related to 
the page size. The register and memory stack size 
are also declared. 

When operating with a monitor, the VECT_PAD may need to be increased. 

        .equ    PS,3                    ;page size designation

        .equ    RPN_SHIFT,(10 + PS)

        .equ    PAGE_SIZE,(1 << RPN_SHIFT)

        .equ    MMU_PS,(PS << 8)

        .equ    RPN_MASK,(~(PAGE_SIZE - 1))

        .equ    VECT_PAD,(PAGE_SIZE - 0x400)

.equ RSTK_SIZE,PAGE_SIZE 

.equ MSTK_SIZE,PAGE_SIZE 
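
The page-size equates above can be checked by evaluating them in C. The sketch below mirrors the .equ names; page_base and page_offset are hypothetical helpers added only to show how RPN_MASK splits an address. With PS = 3 the page size works out to 8K bytes, so the 1K-byte vector table is padded by 7K bytes to fill one page.

```c
#include <stdint.h>

enum { PS = 3 };                     /* page size designation        */
enum { RPN_SHIFT = 10 + PS };        /* 13 address bits within a page */

#define PAGE_SIZE ((uint32_t)1 << RPN_SHIFT)   /* 8192 bytes           */
#define MMU_PS    ((uint32_t)PS << 8)          /* 0x300                */
#define RPN_MASK  (~(PAGE_SIZE - 1))           /* 0xFFFFE000           */
#define VECT_PAD  (PAGE_SIZE - 0x400)          /* pad after 256 vectors */

/* Hypothetical helpers: split an address into page base and offset. */
uint32_t page_base(uint32_t addr)   { return addr & RPN_MASK; }
uint32_t page_offset(uint32_t addr) { return addr & (PAGE_SIZE - 1); }
```

Changing PS is then the only edit needed to move to another page size; every other constant follows from it.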

.include "romdcl.h" 



NOTE: The equates below define traps for divide by zero 
and divide overflow. They are not standard. They 
are not handled here. 



        .equ    V_DIV0,80               ;divide by zero
        .equ    V_DIVOV,81              ;divide overflow

        .eject
        .sbttl  "Section D
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Sections will be ordered in memory as shown below. 

vectors (at 0) 

rstack (register stack) 

mstack (memory stack) 

.data

.bss
.text

endsect (dummy for establishing bounds) 

Vectors will be initialized by start-up code with pointers to an invalid trap 
handler in ROM. The initialization code will explicitly intercept those vectors 
that will be handled. 



        .sect   vectors,bss
        .sect   rstack,bss
        .sect   mstack,bss
        .sect   endsect,bss



The declarations that follow suggest the order of the segments, provide base 
names for each, and allocate sizes for the vectors and stacks. 

Jump instructions are also provided at the base of the .text section for ease 
in linkage to the Start routine and the special routine which provides for 
ADAPT29K initializations. 





        .use    vectors
        .block  (4 * 256)
        .block  VECT_PAD
        .use    rstack
RStkBase:
        .block  RSTK_SIZE
RStkTop:
        .use    mstack
MStkBase:
        .block  MSTK_SIZE
MStkTop:
        .data
DataBase:                               ;base of init
        .bss
BSSBase:                                ;base of BSS data
        .text
TextBase:                               ;base of .text
        jmp     Start                   ;allows easy linkage to Start
        nop                             ;for bootstrap code
        jmp     AdaptInit               ;makes AdaptInit easier to find
        nop
        .use    endsect
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se: 








                                        ;marks end of .text
        .block  4                       ;dummy to assure existence
        .text                           ;switch back to text

        .eject
        .sbttl  "Timer read/write functions"

        .global _GetTmCnt
        .global _SetTmCnt
        .global _GetTmRld
        .global _SetTmRld

        LEAF    _GetTmCnt,0



This routine returns the timer/counter register value. All the fields are returned; 
i.e., no mask is applied. 

In: (nothing) 

Out: (see below) 

        .reg    GTC_Val,%%(RET_VAL + 0)         ;timer reg value
        mfsr    GTC_Val,tmc
        EPILOGUE

        LEAF    _SetTmCnt,1

This routine sets the timer/counter register value. All the fields are set;
i.e., no mask is applied.

In:     (see below)
Out:    (nothing)

        .reg    STC_Val,%%(IN_PRM + 0)          ;timer reg value
        mtsr    tmc,STC_Val
        EPILOGUE

        LEAF    _GetTmRld,0

This routine gets the current contents of the timer reload register. No masks
are applied.

In:     (nothing)
Out:    (see below)

        .reg    GTR_Val,%%(RET_VAL + 0)         ;timer reload value
        mfsr    GTR_Val,tmr
        EPILOGUE

        LEAF    _SetTmRld,1

This routine sets the timer/counter reload value. All the fields are set; 
i.e., no mask is applied. 

In: (see below) 
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Out: 



(nothing)

        .reg    STR_Val,%%(IN_PRM + 0)          ;timer reload value
        mtsr    tmr,STR_Val
        EPILOGUE



        .eject
        .sbttl  "32-bit Time Extensions"

The routines below extend the timer counter to 32 bits via a trap handler. The
32-bit value may be initialized and read by C-callable routines declared as
globals. The trap handler is also included. Note that the caller of the C routines
must be running in supervisor mode.

        .global _ClrTm32
        .global _GetTm32
        .bss                                    ;switch to declare bss
TimeUpper:
        .block  4                               ;reserve a word for extension
        .text                                   ;switch back

        LEAF    _ClrTm32,0

This routine clears the 32-bit extended counter by setting the tmc, tmr and
software extension value. The timer interrupt is also enabled in tmr.

In:     (nothing)
Out:    (nothing)       (timer initialized to zero)
Temp:   (see below)

        .reg    CTVal,%%(TEMP_REG + 0)          ;timer reg value
        .reg    CTUpPt,%%(TEMP_REG + 1)         ;upper pointer
        const   CTVal,0xffffff                  ;for tc and TimeUpper
        consth  CTVal,0xffffff
        mtsr    tmc,CTVal                       ;should keep it busy
        consth  CTVal,0x1ffffff                 ;set ie
        mtsr    tmr,CTVal
        const   CTUpPt,TimeUpper
        consth  CTUpPt,TimeUpper
        const   CTVal,0                         ;no extension
        store   0,0,CTVal,CTUpPt
        EPILOGUE



        LEAF    _GetTm32,0

This routine returns a 32-bit clock counter. The clock counter is implemented
by extending the hardware counter in software and negating the value before it is
returned. The negation causes the returned value to be an up counter of the time
since the counter was last reset. The low-level timer access routines may be used
in initializations to assure a desired starting value.

The software extension to 32 bits introduces a coordination problem in reading
the counter's value. This is resolved by reading the upper 8 bits both before
and after the TC value. If the TC value is greater than 2**23, the second upper
value read is presumed to be correct. Lengthy interruptions of this routine
(> 2**21 clocks) could cause errors.

In:     (nothing)
Out:    (see below)
Temp:   (see below)

        .reg    TUpPt,%%(TEMP_REG + 0)          ;upper time pointer
        .reg    TUpr1,%%(TEMP_REG + 1)          ;upper time bits - 1st read
        .reg    TUpr2,%%(TEMP_REG + 2)          ;upper time bits - 2nd read
        .reg    TLwr,%%(TEMP_REG + 3)           ;lower time bits - from cntr
        .reg    TChk,%%(TEMP_REG + 4)           ;temp to check high bit
        .reg    T32,%%(RET_VAL + 0)             ;32-bit time value
        const   TUpPt,TimeUpper                 ;get upper 8 bits of timer
        consth  TUpPt,TimeUpper
        load    0,0,TUpr1,TUpPt
        add     TUpr1,TUpr1,0                   ;hold till load complete
        mfsr    TLwr,tmc
        load    0,0,TUpr2,TUpPt                 ;get upper 8 bits again
        sll     TChk,TLwr,8                     ;is upper TC bit set?
        jmpf    TChk,GT_Exit                    ;if not, use 1st read
        or      T32,TLwr,TUpr1                  ;poss ovfl before 2nd read
        or      T32,T32,TUpr2                   ;poss ovfl after 1st read
GT_Exit:
        subr    T32,T32,0                       ;negate to count up from zero
        EPILOGUE
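
The selection rule described for _GetTm32 can be illustrated in C. This sketch models only the rule stated in the prose (use the second upper read when the top bit of the 24-bit counter is set, otherwise the first), followed by the negation. It is not a transcription of the instruction sequence above, and the function name combine_tm32 and its parameters are hypothetical.

```c
#include <stdint.h>

/* upper_first / upper_second: TimeUpper as read before and after tmc
 * (extension bits in the top 8 bits). tmc: raw timer counter value. */
uint32_t combine_tm32(uint32_t upper_first, uint32_t tmc,
                      uint32_t upper_second)
{
    uint32_t low = tmc & 0x00FFFFFFu;          /* 24-bit down counter */

    /* A counter value at or above 2**23 means the counter may have
     * just wrapped, so the later read of the extension is trusted. */
    uint32_t upper = (low & 0x00800000u) ? upper_second : upper_first;

    uint32_t down = (upper & 0xFF000000u) | low;
    return (uint32_t)0 - down;                 /* down count -> up count */
}
```

For example, if the counter wrapped between the first read of TimeUpper and the read of tmc, tmc is close to 0xFFFFFF and the second (already decremented) extension value is the one that matches it.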



TimerHandler: 

This routine handles the timer trap. The timer trap will occur at intervals in the 
range of a second (depending on the actual clock speed) . The extension to 32 bits 
makes the timer somewhat more useful for common benchmarks. A different scheme 
would be required for longer intervals. 



        .reg    THTr,%%(SYS_TEMP + 0)           ;temp for tmr (shared)
        .reg    THUpPt,%%(SYS_TEMP + 0)         ;pointer to upper 8 bits
        .reg    THUpVl,%%(SYS_TEMP + 1)         ;upper 8-bit value
        mfsr    THTr,tmr                        ;clear out upper tmr bits
        sll     THTr,THTr,7                     ;leaving ie alone
        srl     THTr,THTr,7
        mtsr    tmr,THTr
        const   THUpPt,TimeUpper
        consth  THUpPt,TimeUpper
        load    0,0,THUpVl,THUpPt
        srl     THUpVl,THUpVl,24                ;decrement the upper bits
        sub     THUpVl,THUpVl,1
        sll     THUpVl,THUpVl,24
        store   0,0,THUpVl,THUpPt
        iret                                    ;done


.eject 




.sbttl 


"C Interrupt Handler In 


.global 


CIntf 



CIntf : 

This routine is used to call a C routine that will handle an interrupt. In order 
to accomplish this, the context of the current program must be saved prior to the 
call and restored after the call. It is relatively expensive. In many 
instances, it may be best to write the interrupt handlers in assembly-language. Note 
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that assembly-language handlers will have the system statics available to retain 
state information. Note also that system statics are not saved and restored here. 
They are "static." 

This routine receives as inputs the address of the C routine and the vector number. 
It passes the vector number to the C routine as its only parameter. An initial 
stack of 16 registers (including inputs) is provided to the C routine. 



In:     (SYS_TEMP + 0)          C routine address
        (SYS_TEMP + 1)          vector number
Out:    (nothing)
Temp:   (SYS_TEMP 2-13)         used to hold specials
        (see below)

        .reg    CI_Rout,%%(SYS_TEMP + 0)        ;the C routine
        .reg    CI_Vect,%%(SYS_TEMP + 1)        ;the vector
        .reg    CI_Stk,%%(SYS_TEMP + 14)        ;stack check value
        .reg    CI_Frm,%%(SYS_TEMP + 14)        ;frame size (shared)
        mfsr    st2,ops                         ;save specials temps
        mfsr    st3,cha
        mfsr    st4,chd
        mfsr    st5,chc
        mfsr    st6,pc0
        mfsr    st7,pc1
        mfsr    st8,ipc
        mfsr    st9,ipa
        mfsr    st10,ipb
        mfsr    st11,q
        mfsr    st12,alu
        add     st13,rsp,0
        mtsrim  cps,0x73                        ;PD,PI,SM,DI,DA
        sub     msp,msp,((64 - 16) * 4)         ;allocate space for globals
        const   CI_Stk,MStkBase                 ;check for overflow
        consth  CI_Stk,MStkBase
        asge    V_DataTLBProt,msp,CI_Stk        ;simulate Prot (no return on fail)
        store   0,0,gr80,msp                    ;flush for CPU bug
        mtsrim  CR,((64 - 16) - 1)
        storem  0,0,gr80,msp                    ;save the globals
        add     rfb,rsp,0                       ;move down the frame
        const   CI_Frm,512                      ;beneath rsp
        sub     rab,rfb,CI_Frm
        add     rsp,rab,(13 * 4)                ;set rsp in 16 reg frame
        sub     msp,msp,(16 * 4)                ;save the frame
        mtsrim  CR,(16 - 1)
        storem  0,0,rab,msp
        add     lr1,rfb,0                       ;require remaining locals
        add     p0,CI_Vect,0                    ;vector is output parm
        calli   lr0,CI_Rout                     ;call the handler
        mtsrim  cps,0x13                        ;with prot and no ints (no good
                                                ; for more complex TLB schemes)
        mtsrim  cps,0x73
        sub     rab,rsp,(13 * 4)                ;ready to reload
        mtsrim  CR,(16 - 1)                     ;reload locals in frame
        loadm   0,0,rab,msp
        add     msp,msp,(16 * 4)
        mtsrim  CR,((64 - 16) - 1)              ;reload globals
        loadm   0,0,gr64,msp
        add     msp,msp,((64 - 16) * 4)
        mtsr    ops,st2                         ;restore specials
        mtsr    cha,st3
        mtsr    chd,st4
        mtsr    chc,st5
        mtsr    pc0,st6
        mtsr    pc1,st7
        mtsr    ipc,st8
        mtsr    ipa,st9
        mtsr    ipb,st10
        mtsr    q,st11
        mtsr    alu,st12
        add     rsp,st13,0
        iret                                    ;return from int



        .eject
        .sbttl  "Multiply and Divide Handlers"

MultiplyHandler:

This trap handler performs the (signed) operation:
        DEST//Q <- SRCA * SRCB.

IPC, IPA, and IPB are set by the MULTIPLY instruction prior to the invocation of
this trap handler.

In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
Out:    DEST//Q
        IPB = IPC       (unimportant side effect)
Temp:   (see below)

        .reg    MH_IP,%%(SYS_TEMP + 0)  ;temp for move operation
        mtsr    q,gr0                   ;SRCB (multiplier) to Q
        mfsr    MH_IP,ipc               ;use a system temp to set
        mtsr    ipb,MH_IP               ; ipb = ipc
        mul     gr0,gr0,0               ;step 1. (no initial prod)
        mul     gr0,gr0,gr0             ;step 2.
        mul     gr0,gr0,gr0             ;step 3.
        mul     gr0,gr0,gr0             ;step 4.
        mul     gr0,gr0,gr0             ;step 5.
        mul     gr0,gr0,gr0             ;step 6.
        mul     gr0,gr0,gr0             ;step 7.
        mul     gr0,gr0,gr0             ;step 8.
        mul     gr0,gr0,gr0             ;step 9.
        mul     gr0,gr0,gr0             ;step 10.
        mul     gr0,gr0,gr0             ;step 11.
        mul     gr0,gr0,gr0             ;step 12.
        mul     gr0,gr0,gr0             ;step 13.
        mul     gr0,gr0,gr0             ;step 14.
        mul     gr0,gr0,gr0             ;step 15.
        mul     gr0,gr0,gr0             ;step 16.
        mul     gr0,gr0,gr0             ;step 17.
        mul     gr0,gr0,gr0             ;step 18.
        mul     gr0,gr0,gr0             ;step 19.
        mul     gr0,gr0,gr0             ;step 20.
        mul     gr0,gr0,gr0             ;step 21.
        mul     gr0,gr0,gr0             ;step 22.
        mul     gr0,gr0,gr0             ;step 23.
        mul     gr0,gr0,gr0             ;step 24.
        mul     gr0,gr0,gr0             ;step 25.
        mul     gr0,gr0,gr0             ;step 26.
        mul     gr0,gr0,gr0             ;step 27.
        mul     gr0,gr0,gr0             ;step 28.
        mul     gr0,gr0,gr0             ;step 29.
        mul     gr0,gr0,gr0             ;step 30.
        mul     gr0,gr0,gr0             ;step 31.
        mull    gr0,gr0,gr0             ;step 32.
        iret                            ;done



This trap handler performs the (unsigned) operation 
DEST//Q <- SRCA * SRCB. 

IPC,IPA,and IPB are set by the MULTIPLU instruction prior to 
the invocation of this trap handler. 
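
The 32 multiply steps of the unsigned handler correspond to a classic shift-and-add loop. The sketch below is written under the assumption that each step adds the multiplicand to the partial product when the low bit of Q is set and then shifts the 64-bit pair (partial product, Q) right by one; the name mulu_steps is hypothetical, and the sketch does not model the signed correction made by the final mull step of the signed handler.

```c
#include <stdint.h>

/* 32 x 32 -> 64 bit unsigned multiply by 32 shift-and-add steps.
 * a: multiplicand (SRCA), b: multiplier (SRCB, loaded into Q).
 * On return *hi:*lo holds the product, matching DEST//Q. */
void mulu_steps(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t p = 0;      /* partial product, high word  */
    uint32_t q = b;      /* multiplier, becomes low word */

    for (int step = 0; step < 32; step++) {
        uint64_t sum = p;
        if (q & 1u)
            sum += a;                       /* up to 33 bits */
        /* shift the (sum, q) pair right by one bit */
        q = (q >> 1) | ((uint32_t)(sum & 1u) << 31);
        p = (uint32_t)(sum >> 1);
    }
    *hi = p;
    *lo = q;
}
```

Unrolling the loop 32 times, as the handler does, avoids loop-control overhead at the cost of code size.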



In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
Out:    DEST//Q
        IPB = IPC       (unimportant side effect)
Temp:   (see below)

        .reg    MU_IP,%%(SYS_TEMP + 0)  ;temp for move operation
        mtsr    q,gr0                   ;SRCB (multiplier) to Q
        mfsr    MU_IP,ipc               ;use a system temp to set
        mtsr    ipb,MU_IP               ; ipb = ipc
        mulu    gr0,gr0,0               ;step 1. (no initial prod)
        mulu    gr0,gr0,gr0             ;step 2.
        mulu    gr0,gr0,gr0             ;step 3.
        mulu    gr0,gr0,gr0             ;step 4.
        mulu    gr0,gr0,gr0             ;step 5.
        mulu    gr0,gr0,gr0             ;step 6.
        mulu    gr0,gr0,gr0             ;step 7.
        mulu    gr0,gr0,gr0             ;step 8.
        mulu    gr0,gr0,gr0             ;step 9.
        mulu    gr0,gr0,gr0             ;step 10.
        mulu    gr0,gr0,gr0             ;step 11.
        mulu    gr0,gr0,gr0             ;step 12.
        mulu    gr0,gr0,gr0             ;step 13.
        mulu    gr0,gr0,gr0             ;step 14.
        mulu    gr0,gr0,gr0             ;step 15.
        mulu    gr0,gr0,gr0             ;step 16.
        mulu    gr0,gr0,gr0             ;step 17.
        mulu    gr0,gr0,gr0             ;step 18.
        mulu    gr0,gr0,gr0             ;step 19.
        mulu    gr0,gr0,gr0             ;step 20.
        mulu    gr0,gr0,gr0             ;step 21.
        mulu    gr0,gr0,gr0             ;step 22.
        mulu    gr0,gr0,gr0             ;step 23.
        mulu    gr0,gr0,gr0             ;step 24.
        mulu    gr0,gr0,gr0             ;step 25.
        mulu    gr0,gr0,gr0             ;step 26.
        mulu    gr0,gr0,gr0             ;step 27.
        mulu    gr0,gr0,gr0             ;step 28.
        mulu    gr0,gr0,gr0             ;step 29.
        mulu    gr0,gr0,gr0             ;step 30.
        mulu    gr0,gr0,gr0             ;step 31.
        mulu    gr0,gr0,gr0             ;step 32.
        iret                            ;done
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DivideHandler: 

This trap handler performs the (signed) operation: 
DEST <- (SRCA//Q) / SRCB 

IPC,IPA, and IPB are set by the DIVIDE instruction prior to 
the invocation of this trap handler. 



IPC 


DEST 


IPA 


SRCA 


IPB 


SRCB 


Q 





DEST 



Temp: (see below) 



.reg D_Rmdr, %% (SYS_TEMP + 0) 

.reg D_Dvsr, %% (SYS_TEMP + 1) 

.reg D_Sign, %% (SYS_TEMP + 2) 

.reg D_DvdHi, %% (SYS_TEMP + 3) 

.reg D_DvdLo, %% (SYS_TEMP + 4) 

.reg D_Quot, %% (SYS_TEMP + 5) 

.reg D_Ovf 1, %% (SYS_TEMP + 6) 

.reg D_MnNg, %% (SYS_TEMP + 7) 

add D_DvdHi,grO, 

mfsr D_DvdLo,q 

sub D_Dvsr,D_Dvsr, 

add D_Dvsr,D_Dvsr,grO 

asneq V_DIVO, D_Dvsr, 



r shift area and remainder 

; divisor 

■0 for positive 

; dividend high 

: dividend low 



rmost negative integer 
r SRCA is dividend high 
rQ is dividend low 
: divisor is in SRCB 
rany easier access? 
r check for divisor zero 



DividendCheck : 
jmpf 
const 
cpeq 
subr 
subrc 



D_DvdHi,DivisorCheck 
D_Sign, FALSE 
D_Sign, D_Sign, 
D_DvdLo , D_DvdLo , 
D DvdHi,D DvdHi,0 



: toggle flag 

: negate dividend 



DivisorCheck: 
jmpf 
nop 
cpeq 
subr 



DivideOp: 



mtsr 

divO 

div 

div 

div 

div 

div 

div 

div 

div 

div 

div 

div 

div 



D_Dvsr, DivideOp 

D_Sign, D_Sign, 
D Dvsr,D Dvsr,0 



q,D_DvdLo 
D_Rmdr, D_DvdHi 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr , D_ 
D_Rmdr, D_Rmdr , D_ 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr, D_ 
D_Rmdr, D_Rmdr , D_ 
D_Rmdr, D_Rmdr, D_ 
D Rmdr,D Rmdr,D 



Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 
Dvsr 



toggle flag 


negate divisor 


dividend low to q 


D_Rmdr becomes shift high 


step 


1. 


step 


2. 


step 


3. 


step 


4. 


step 


5. 


step 


6. 


step 


7. 


step 


8. 


step 


9. 


step 


10. 


step 


11. 


step 


12. 
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div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr , D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rmdr, D_Dvsr 




div 


D_Rmdr, D_Rindr, D_Dvsr 




div 


D_Rmdr, D_Rindr, D_Dvsr 


; 


divrem 


D_Rmdr, D_Rmdr, D_Dvsr 




mfsr 


D_Quot,q 




cplt 


D_Ovfl,D_Quot,0 




jmpf 


D_Sign, DivideCorrect 




cpeq 


D_MnNg, D_MnNg, D_MnNg 




cpeq 


D_Ovf 1, D_MnNg, D_Quot 




cpneq 


D_Ovf 1 , D_Ovf 1 , D_Sign 


Div 


ideCorrect: 
jmpf 


D_Sign, DivideExit 




aseq 


V_DIVOV,D_Ovfl,0 




subr 


D_Quot , D_Quot , 


'■ 


subr 


D_Rmdr, D_Rmdr, 


DivideExit : 






add 


grO,D_Quot, 




iret 


; done 



rstep 13. 

; step 14 . 

rstep 15. 

rstep 16. 

rstep 17. 

rstep 18. 

rstep 19. 

;step 20. 

;step 21. 

;step 22. 

;step 23. 

rstep 24. 

; step 25. 

; step 26. 

; step 27 . 

;step 28. 

;step 29. 

;step 30. 

; step 31. 

; don't need remainder 

;get quotient out of c 

; check overflow 

; set most neg 
(•check for most neg 
; allow if to be neg 



;done if positive 
rtrap on overflow 
r negate quotient 
r don't need remainder 



:set DEST 



DividuHandler: 

This trap handler performs the (unsigned) operation: 
DEST <- (SRCA//Q) / SRCB 

IPC, IPA, and IPB are set by the DIVIDU instruction prior to 
the invocation of this trap handler. 
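
The operation DEST <- (SRCA//Q) / SRCB divides a 64-bit dividend by a 32-bit divisor in 32 steps. The sketch below shows the idea with a simple restoring loop; it does not attempt to model the exact semantics of the div0/div/divrem step instructions. The name divu_steps and its return codes are hypothetical, and the explicit zero and overflow checks are additions that the unsigned handler above does not perform.

```c
#include <stdint.h>

/* Divide the 64-bit value hi:lo by d, one quotient bit per step.
 * Returns 0 on success, 1 for divide by zero, 2 when the quotient
 * would not fit in 32 bits. */
int divu_steps(uint32_t hi, uint32_t lo, uint32_t d,
               uint32_t *quot, uint32_t *rem)
{
    if (d == 0)
        return 1;
    if (hi >= d)
        return 2;                 /* quotient needs more than 32 bits */

    uint64_t r = hi;              /* running remainder (shift high) */
    uint32_t q = lo;              /* dividend low, becomes quotient  */

    for (int step = 0; step < 32; step++) {
        r = (r << 1) | (q >> 31); /* shift the r:q pair left by one  */
        q <<= 1;
        if (r >= d) {             /* trial subtraction succeeds      */
            r -= d;
            q |= 1u;
        }
    }
    *quot = q;
    *rem = (uint32_t)r;
    return 0;
}
```

As in the handler, the quotient accumulates in the register that initially held the low half of the dividend, which is why the result is read back from Q at the end.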



In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
        Q
Out:    DEST
Temp:   (see below)

        .reg    DU_Rmdr,%%(SYS_TEMP + 0)        ;shift area and remainder
        add     DU_Rmdr,gr0,0                   ;SRCA to DU_Rmdr
        div0    DU_Rmdr,DU_Rmdr                 ;DU_Rmdr becomes shift high
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 1.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 2.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 3.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 4.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 5.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 6.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 7.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 8.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 9.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 10.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 11.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 12.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 13.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 14.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 15.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 16.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 17.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 18.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 19.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 20.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 21.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 22.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 23.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 24.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 25.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 26.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 27.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 28.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 29.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 30.
        div     DU_Rmdr,DU_Rmdr,gr0             ;step 31.
        divrem  DU_Rmdr,DU_Rmdr,gr0             ;don't need remainder
        mfsr    gr0,q                           ;quotient to (ipc)
        iret                                    ;done

        .eject
        .sbttl  "Spill and Fill Handlers"


; The routines below handle the allocation and free assertions 
; in subroutine prologues and epilogues. The temps they use 
; are given below. 



        .reg    R_Cnt,%%(SYS_TEMP + 0)          ;temp for count (shared)
        .reg    R_Bnd,%%(SYS_TEMP + 0)          ;temp for boundary
        .reg    R_TmpPC0,%%(SYS_TEMP + 1)       ;temp for PC0
        .reg    R_TmpPC1,%%(SYS_TEMP + 2)       ;temp for PC1

SpillHandler:

This routine handles a false assertion in the standard prologue.

In:     rab > rsp               (requiring an allocation)
        lr1 <= rfb
        rfb == rab + 512
Out:    rab == rsp              (just enough allocated)
        lr1 <= rfb
        rfb == rab + 512

        mfsr    R_TmpPC0,pc0                    ;save the PCs
        mfsr    R_TmpPC1,pc1
        mtsrim  cps,0x73                        ;PD,PI,SM,DI,DA
        sub     R_Cnt,rab,rsp                   ;R_Cnt = # of bytes to spill
        sub     rfb,rfb,R_Cnt                   ;move down the frame bound
        store   0,0,lr0,rfb                     ;flush for storem bug
        srl     R_Cnt,R_Cnt,2                   ;R_Cnt = count of words to spill
        sub     R_Cnt,R_Cnt,1                   ;correct for storem
        mtsr    cr,R_Cnt                        ;set up count for storem
        storem  0,0,lr0,rfb                     ;spill from the allocated area
        add     rab,rsp,0                       ;move down the allocate bound
        const   R_Bnd,RStkBase                  ;check for possible overflow
        consth  R_Bnd,RStkBase
        asge    V_DataTLBProt,rab,R_Bnd         ;simulate TLB prot
                                                ;NOTE: no return on fail
        mtsrim  cps,0x473                       ;FZ,PD,PI,SM,DI,DA
        mtsr    pc0,R_TmpPC0                    ;restore the PCs
        mtsr    pc1,R_TmpPC1
        iret

FillHandler:

This routine handles a false assertion in the standard epilogue.

In:     lr1 > rfb               (requiring deallocation)
        rsp >= rab
        rfb == rab + 512
Out:    lr1 == rfb              (just enough freed)
        rsp >= rab
        rfb == rab + 512

        mfsr    R_TmpPC0,pc0                    ;save the PCs
        mfsr    R_TmpPC1,pc1
        mtsrim  cps,0x73                        ;PD,PI,SM,DI,DA
        const   R_Bnd,RStkTop                   ;check for possible underflow
        consth  R_Bnd,RStkTop
        asle    V_DataTLBProt,rfb,R_Bnd         ;simulate TLB prot
                                                ;NOTE: no return on fail
        const   R_Cnt,512                       ;make local reg ip
        or      R_Cnt,R_Cnt,rfb                 ;from rfb
        mtsr    ipa,R_Cnt                       ;set up indirect ptr for loadm
        sub     R_Cnt,lr1,rfb                   ;R_Cnt = # of bytes to fill
        add     rab,rab,R_Cnt                   ;move up the allocate bound
        srl     R_Cnt,R_Cnt,2                   ;R_Cnt = number of words to fill
        sub     R_Cnt,R_Cnt,1                   ;correct for loadm
        mtsr    cr,R_Cnt                        ;set up count for loadm
        loadm   0,0,gr0,rfb                     ;fill area freed
        add     rfb,lr1,0                       ;move up frame bound
        mtsrim  cps,0x473                       ;FZ,PD,PI,SM,DI,DA
        mtsr    pc0,R_TmpPC0                    ;restore the PCs
        mtsr    pc1,R_TmpPC1
        iret

        .eject
        .sbttl  "TLB Miss Handler"



; The routines below provide one-for-one TLBs, i.e., the virtual address is set equal
; to the physical address. A central routine is used to do the actual TLB update.
;
; Some enhancement would be appropriate to allow I/O access as data, i.e.,
; memory-mapped I/O. Speed improvements could be realized (four instructions) by the
; allocation and initialization of system registers for the bounds.
;
; The temp registers used are indicated below.

        .reg    TH_Ad,%%(SYS_TEMP + 0)          ;the miss address
        .reg    TH_Ac,%%(SYS_TEMP + 1)          ;the required privileges
        .reg    TH_Bnd,%%(SYS_TEMP + 2)         ;access bound
        .reg    TH_Reg,%%(SYS_TEMP + 3)         ;TLB register number
        .reg    TH_Wd0,%%(SYS_TEMP + 4)         ;TLB word 0 value
        .reg    TH_Wd1,%%(SYS_TEMP + 5)         ;TLB word 1 value



SupInstTLBHandler:
; This routine handles supervisor instruction TLB misses.
;
; An attempted access out of range is treated as an instruction
; TLB protection violation.

        mfsr    TH_Ad,pc1
        const   TH_Bnd,TextBase
        consth  TH_Bnd,TextBase
        asge    V_InstTLBProt,TH_Ad,TH_Bnd      ;NOTE: no return on fail
        const   TH_Bnd,EndBase
        consth  TH_Bnd,EndBase
        aslt    V_InstTLBProt,TH_Ad,TH_Bnd      ;NOTE: no return on fail
        jmp     TLBHandler
        const   TH_Ac,0x4800                    ;VE,SE



SupDataTLBHandler:
; This routine handles the supervisor data TLB misses. It should
; be enhanced to allow I/O access as well as data access.

        mfsr    TH_Ad,cha
        const   TH_Ac,0x7000                    ;VE,SR,SW
        const   TH_Bnd,MStkBase
        consth  TH_Bnd,MStkBase
        asge    V_DataTLBProt,TH_Ad,TH_Bnd      ;NOTE: no return on fail
        const   TH_Bnd,TextBase
        consth  TH_Bnd,TextBase
        aslt    V_InstTLBProt,TH_Ad,TH_Bnd      ;NOTE: no return on fail
                                                ;(drop through to TLB handler)



TLBHandler:
; This routine handles TLB updates once it has been determined
; that the update is appropriate.
;
; NOTE: This routine presumes an 8K-byte page size.
;
; In:   TH_Ad   the address where access is required
;       TH_Ac   the access that is required
;       lru     the recommended TLB for replacement
;
; Out:  (lru)   provides access to TH_Ad

        constn  TH_Wd1,RPN_MASK
        sll     TH_Wd0,TH_Wd1,5                 ;shift for vtag
        and     TH_Wd1,TH_Wd1,TH_Ad             ;establish addr fields
        and     TH_Wd0,TH_Wd0,TH_Ad
        or      TH_Wd0,TH_Wd0,TH_Ac             ;establish access
        mfsr    TH_Reg,lru                      ;set the TLB entry
        mttlb   TH_Reg,TH_Wd0
        add     TH_Reg,TH_Reg,1
        mttlb   TH_Reg,TH_Wd1
        iret
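
The mask arithmetic in TLBHandler can be modeled on a host machine. The following is a minimal C sketch, not part of the original listing. It assumes that the low 16 bits of RPN_MASK are 0xE000 (a value consistent with the 8K-byte page size noted above; RPN_MASK itself is defined outside this excerpt), and the names tlb_entry, constn, and tlb_identity_entry are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: with 8K-byte pages, the 16-bit operand of
   "constn TH_Wd1,RPN_MASK" is taken to be 0xE000. */
#define RPN_MASK_LOW16 0xE000u

typedef struct {
    uint32_t word0;   /* tag bits plus access bits */
    uint32_t word1;   /* real page number (equal to the virtual page here) */
} tlb_entry;

/* Models constn: upper half all ones, lower half is the 16-bit constant. */
uint32_t constn(uint16_t imm16)
{
    return 0xFFFF0000u | imm16;
}

/* Models the five ALU instructions at the top of TLBHandler. */
tlb_entry tlb_identity_entry(uint32_t miss_addr, uint32_t access)
{
    tlb_entry e;
    uint32_t rpn_mask  = constn(RPN_MASK_LOW16); /* constn TH_Wd1,RPN_MASK */
    uint32_t vtag_mask = rpn_mask << 5;          /* sll    TH_Wd0,TH_Wd1,5 */

    /* The access bits are expected to sit below the tag field. */
    assert((access & vtag_mask) == 0);

    e.word1 = rpn_mask & miss_addr;              /* and TH_Wd1,TH_Wd1,TH_Ad */
    e.word0 = (vtag_mask & miss_addr) | access;  /* and, then or with TH_Ac */
    return e;
}
```

Under these assumptions, a miss at 0x12345678 with access 0x7000 (VE,SR,SW) yields word 0 = 0x12347000 and word 1 = 0x12344000. The two words would then be written to the register pair selected by lru.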



        .eject
        .sbttl  "TLB Initialization"

        LEAF    TLBInit,0
; This routine is used to initialize the TLBs.
;
; It clears all the TLB registers, thus marking all entries invalid.
;
; In:     (nothing)
; Out:    (nothing)
; Temps:  (see below)

        .reg    TI_Reg,%%(TEMP_REG + 0)         ;the TLB register number
        .reg    TI_Val,%%(TEMP_REG + 1)         ;the TLB value (0)
        .reg    TI_Cnt,%%(TEMP_REG + 2)         ;the TLB register count

        const   TI_Reg,0
        const   TI_Val,0
        const   TI_Cnt,(TLB_CNT - 2)            ;for jmpfdec
TI_Loop:
        mttlb   TI_Reg,TI_Val
        jmpfdec TI_Cnt,TI_Loop
        add     TI_Reg,TI_Reg,1
        EPILOGUE



        .eject
        .sbttl  "Vector Initialization"

; In order that the vector initialization code might be compact
; and that the set of vectors initialized might be easily expanded,
; a table in .data is used. Each entry in the table has two words.
; The first word is the number of the vector to be initialized. The
; second word is the address of the handler.
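
The table-driven scheme described above can be illustrated on a host machine. The following is a minimal C sketch, not part of the original listing. The names vect_init and vect_install are hypothetical, the vector area is modeled as an array of 32-bit words, and the limit of 256 vectors is an assumption taken from the ADAPT29K initialization code later in this listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VECTOR_COUNT 256u   /* assumption: 256 one-word vectors */

/* One table entry: vector number, then handler address. */
typedef struct {
    uint32_t vector;
    uint32_t handler;
} vect_init;

/* Installs each handler at vector_area[vector]. Indexing an array of
   32-bit words performs the same "vector number times 4, plus base"
   address calculation that the assembly does with sll and add. */
void vect_install(uint32_t *vector_area, const vect_init *table, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        assert(table[i].vector < VECTOR_COUNT);
        vector_area[table[i].vector] = table[i].handler;
    }
}
```

Adding a handler then only requires a new two-word entry in the table; the loop itself does not change.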



        .data                                   ;switch to .data for table

VectInitTable:
        .word   V_SupInstTLB,SupInstTLBHandler
        .word   V_SupDataTLB,SupDataTLBHandler
        .word   V_MULTIPLY,MultiplyHandler
        .word   V_DIVIDE,DivideHandler
        .word   V_MULTIPLU,MultipluHandler
        .word   V_DIVIDU,DividuHandler
        .word   V_SPILL,SpillHandler
        .word   V_FILL,FillHandler
        .word   V_Timer,TimerHandler
        .equ    VINIT_CNT,((. - VectInitTable) / 8)

        .text                                   ;switch back to .text for code

VectInit:
; This routine initializes the vectors for which handlers exist.
;
; In:   vab     vector area base
; Out:  (vectors initialized)
; Temp: (see below)

        .reg    VI_Vect,%%(TEMP_REG + 0)        ;vector value
        .reg    VI_St,%%(TEMP_REG + 1)          ;vector storage address
        .reg    VI_Cnt,%%(TEMP_REG + 2)         ;vector count
        .reg    VI_Base,%%(TEMP_REG + 3)        ;vector base
        .reg    VI_TbPt,%%(TEMP_REG + 4)        ;vector base

        mfsr    VI_Base,vab
        const   VI_Cnt,(VINIT_CNT - 2)          ;for jmpfdec
        const   VI_TbPt,VectInitTable
        consth  VI_TbPt,VectInitTable
VI_Loop:
        load    0,0,VI_St,VI_TbPt               ;get the vector
        add     VI_TbPt,VI_TbPt,4
        sll     VI_St,VI_St,2                   ;convert to address (fixed v1.3)
        add     VI_St,VI_St,VI_Base
        load    0,0,VI_Vect,VI_TbPt             ;get the handler
        add     VI_TbPt,VI_TbPt,4
        jmpfdec VI_Cnt,VI_Loop
        store   0,0,VI_Vect,VI_St
        jmpi    lr0
        nop

        .eject
        .sbttl  "ADAPT29K Initialization"

AdaptInit:

; This routine is for use in situations where the bootstrap process
; has not occurred. Instead, the ADAPT29K has been used to load
; the program. Initializations of the vectors, etc., will be
; required.
;
; As an aid to fault identification, the vector table is initialized
; with pointers to the words immediately following the vectors. These
; words are initialized with HALT instructions. When one of these
; halts executes, the ADAPT29K will report the event and the address
; of the halt. This will allow the invalid trap that has occurred
; to be identified.
;
; CAUTION! This requires that the vector pad

        .reg    AI_Vect,%%(TEMP_REG + 0)        ;vector value
        .reg    AI_St,%%(TEMP_REG + 1)          ;vector storage address
        .reg    AI_Cnt,%%(TEMP_REG + 2)         ;vector count register
        .reg    AI_Halt,%%(TEMP_REG + 3)        ;halt instruction register

        mtsrim  cps,0x73                        ;PD,PI,SM,DI,DA
        mtsrim  vab,0
        mfsr    AI_St,vab
        const   AI_Vect,1024                    ;just beyond vectors
        const   AI_Halt,0x89000000
        consth  AI_Halt,0x89000000
        const   AI_Cnt,(256 - 2)                ;for jmpfdec
AI_Loop:
        store   0,0,AI_St,AI_Vect               ;store the vector
        add     AI_St,AI_St,4
        store   0,0,AI_Vect,AI_Halt             ;store the HALT
        jmpfdec AI_Cnt,AI_Loop
        add     AI_Vect,AI_Vect,4
        jmp     Start
        nop

        .eject
        .sbttl  "Start"



Start:
; This routine receives control after any required bootstrap processes. It will
; initialize the vectors which are actually handled, clear the BSS area, initialize
; the TLBs, and establish initial stack pointers and an initial register frame.
; It will then invoke _main.
;
; In the event that _main returns, this routine will perform a warm start.
;
; In:   vab     indicates vector area
; Out:  (nothing)

        mtsrim  cps,0x73                        ;PD,PI,SM,DI,DA
        mtsrim  mmu,MMU_PS                      ;order # =
        mtsrim  cfg,0x10                        ;VF
        const   rfb,RStkTop                     ;set up stack pointers
        consth  rfb,RStkTop
        const   rab,(RStkTop - 512)
        consth  rab,(RStkTop - 512)
        add     lr1,rfb,0
        sub     rsp,rfb,16                      ;lr0,lr1,argc,argv
        const   msp,MStkTop
        consth  msp,MStkTop
        call    lr0,VectInit                    ;install handled vectors
        nop
        call    lr0,TLBInit                     ;establish TLBs invalid
        nop
        call    lr0,ClrTm32                     ;clear and enable timer
        nop
                                                ;(leave to _main ???)
        mtsrim  cps,0x10                        ;SM
        const   lr2,0                           ;argc = 0
        const   lr3,0                           ;argv = 0
        call    lr0,_main
        nop
        mtsrim  cps,0x473                       ;FZ,PD,PI,SM,DI,DA
        mtsrim  ops,0x173                       ;RE,PD,PI,SM,DI,DA
        mtsrim  cfg,1                           ;cache disabled
        mtsrim  chc,0                           ;contents invalid
        mtsrim  pc1,0                           ;cold start address
        mtsrim  pc0,4
        iretinv

; end of start.s

APPENDIX C: test.s



        .title  "Test of Assembly-language Utilities"

; Copyright 1988, Advanced Micro Devices, Inc.
; Written by Gibbons and Associates, Inc.

        .include "romdcl.h"
        .extern GetTm32

        .data
        .word   0xDEADBEEF                      ;just to test
        .bss
        .block  1024                            ;verify zeros
        .text

        .eject
        .sbttl  "Multiply/Divide Test"

        LEAF    MultDiv,0
; This routine gives a test of the multiply and divide trap
; handlers by the simple expedient of performing one of each.
; Using the debugger, it can be forced to loop, etc.
;
; In:   (nothing)
; Out:  (nothing)
; Temp: (see below)

        .reg    MD_Mpd,%%(TEMP_REG + 0)         ;multiplicand
        .reg    MD_Mpr,%%(TEMP_REG + 1)         ;multiplier
        .reg    MD_PrLo,%%(TEMP_REG + 2)        ;product low
        .reg    MD_PrHi,%%(TEMP_REG + 3)        ;product high
        .reg    MD_Mlp,%%(TEMP_REG + 4)         ;BOOLEAN for looping
        .reg    MD_DvdHi,%%(TEMP_REG + 0)       ;dividend high
        .reg    MD_DvdLo,%%(TEMP_REG + 1)       ;dividend low
        .reg    MD_Dvsr,%%(TEMP_REG + 2)        ;divisor
        .reg    MD_Quot,%%(TEMP_REG + 3)        ;quotient
        .reg    MD_Dlp,%%(TEMP_REG + 4)         ;BOOLEAN for looping

        const   MD_Mlp,0                        ;FALSE
M_Loop:
        const   MD_Mpd,3                        ;(full 32-bit for patching)
        consth  MD_Mpd,3
        const   MD_Mpr,5
        consth  MD_Mpr,5
        multiply MD_PrHi,MD_Mpd,MD_Mpr
        mfsr    MD_PrLo,q
        jmpt    MD_Mlp,M_Loop
        nop

        const   MD_Dlp,0                        ;FALSE
D_Loop:
        const   MD_DvdHi,0                      ;(full setting for patch)
        consth  MD_DvdHi,0
        const   MD_DvdLo,15
        consth  MD_DvdLo,15
        const   MD_Dvsr,3
        consth  MD_Dvsr,3
        mtsr    q,MD_DvdLo
        divide  MD_Quot,MD_DvdHi,MD_Dvsr
        jmpt    MD_Dlp,D_Loop
        nop
        EPILOGUE



        .eject
        .sbttl  "Spill/Fill Test"

        FUNCTION _Recurse,1,29,1                ;ALLOC_CNT = 32
; This routine is a simple recursive do-nothing that is used to test
; spill/fill.
;
; It accepts a count as its input, decrements that count, and, if the
; result is zero or greater, calls itself with the now decremented
; count. Each instance of the routine allocates 32 new registers.
; Thus the total register requirement is 32 * (InCnt + 1) where InCnt
; is the input count.
;
; In:   (see below)
; Out:  (nothing in final return)
; Temp: (allocated but not used)

        .reg    R_InCnt,%%(IN_PRM + 0)
        .reg    R_OutCnt,%%(OUT_PRM + 0)

        sub     R_OutCnt,R_InCnt,1
        jmpt    R_OutCnt,R_Exit
        nop
        call    lr0,_Recurse
        nop
R_Exit:
        EPILOGUE



        .eject
        .sbttl  "C Interrupt Interface Test"
        .extern CIntf

        LEAF    _Trap70,1
; This "C" routine handles trap 70. It increments the value of a global
; system register so that its effect may easily be seen.
;
; In:   (see below)
; Out:  st0     incremented
;       st1     set to input parameter

        .reg    T70_V,%%(IN_PRM + 0)            ;the vector

        add     st0,st0,1
        add     st1,T70_V,0
        EPILOGUE

Trap70:
; This is the assembly-language routine that should get control on
; trap 70. It invokes CIntf in such a way as to give control to
; _Trap70, the "C" routine above. Note that control never returns
; to this routine. CIntf performs the iret.
;
; In:   (nothing)
; Out:  (nothing)

        .reg    T70_Rout,%%(SYS_TEMP
        .reg    T70_Vect,%%(SYS_TEMP

        const   T70_Rout,_Trap70
        consth  T70_Rout,_Trap70
        jmp     CIntf
        const   T70_Vect,70

        .eject
        .sbttl  "_main"
        .global _main

        FUNCTION _main,2,2,1
; This routine plays the role of a C main routine. It
; is coded in assembly language to ease testing with
; an absolute debugger.

        .reg    argc,%%(IN_PRM + 0)             ;argc (= 0)
        .reg    argv,%%(IN_PRM + 1)             ;argv (= NULL)
        .reg    StTm,%%(LOC_REG + 0)            ;start time
        .reg    EndTm,%%(LOC_REG + 1)           ;end time

        call    lr0,_GetTm32                    ;should return start time
        nop
        add     StTm,v0,0                       ;save the result
        call    lr0,_MultDiv                    ;test multiply/divide
        nop
        call    lr0,_Recurse                    ;test spill/fill
        const   p0,15                           ;require 1024 registers
        asneq   70,gr1,gr1                      ;force trap 70
        call    lr0,_GetTm32                    ;should return end time
        nop
        add     EndTm,v0,0                      ;save the result
        EPILOGUE

; end of test.s

APPENDIX D: romdcl.h



        .eject
        .sbttl  "Register, Constant, and Macro Declarations"

; Copyright 1988, Advanced Micro Devices
; Written by Gibbons and Associates, Inc.

; Global registers

        .reg    rsp,gr1                 ;local reg. var. stack pointer

        .equ    SYS_TEMP,64             ;system temp registers
        .reg    st0,gr64
        .reg    st1,gr65
        .reg    st2,gr66
        .reg    st3,gr67
        .reg    st4,gr68
        .reg    st5,gr69
        .reg    st6,gr70
        .reg    st7,gr71
        .reg    st8,gr72
        .reg    st9,gr73
        .reg    st10,gr74
        .reg    st11,gr75
        .reg    st12,gr76
        .reg    st13,gr77
        .reg    st14,gr78
        .reg    st15,gr79

        .equ    SYS_STAT,80             ;system static registers
        .reg    ss0,gr80
        .reg    ss1,gr81
        .reg    ss2,gr82
        .reg    ss3,gr83
        .reg    ss4,gr84
        .reg    ss5,gr85
        .reg    ss6,gr86
        .reg    ss7,gr87
        .reg    ss8,gr88
        .reg    ss9,gr89
        .reg    ss10,gr90
        .reg    ss11,gr91
        .reg    ss12,gr92
        .reg    ss13,gr93
        .reg    ss14,gr94
        .reg    ss15,gr95

        .equ    RET_VAL,96              ;return registers
        .reg    v0,gr96
        .reg    v1,gr97
        .reg    v2,gr98
        .reg    v3,gr99
        .reg    v4,gr100
        .reg    v5,gr101
        .reg    v6,gr102
        .reg    v7,gr103
        .reg    v8,gr104
        .reg    v9,gr105
        .reg    v10,gr106
        .reg    v11,gr107
        .reg    v12,gr108
        .reg    v13,gr109
        .reg    v14,gr110
        .reg    v15,gr111


        .equ    TEMP_REG,96             ;temp registers
        .reg    t0,gr96
        .reg    t1,gr97
        .reg    t2,gr98
        .reg    t3,gr99
        .reg    t4,gr100
        .reg    t5,gr101
        .reg    t6,gr102
        .reg    t7,gr103
        .reg    t8,gr104
        .reg    t9,gr105
        .reg    t10,gr106
        .reg    t11,gr107
        .reg    t12,gr108
        .reg    t13,gr109
        .reg    t14,gr110
        .reg    t15,gr111

        .equ    RES_REG,112             ;reserved (for user)
        .reg    r0,gr112
        .reg    r1,gr113
        .reg    r2,gr114
        .reg    r3,gr115

        .equ    TEMP_EXT,116            ;temp extension (and shared)
        .reg    x0,gr116
        .reg    x1,gr117
        .reg    x2,gr118
        .reg    x3,gr119
        .reg    x4,gr120
        .reg    x5,gr121
        .reg    x6,gr122
        .reg    x7,gr123
        .reg    x8,gr124

; Global registers with special calling convention uses

        .reg    tav,gr121               ;trap handler argument (also x6)
        .reg    tpc,gr122               ;trap handler return (also x7)
        .reg    lsrp,gr123              ;large return pointer (also x8)
        .reg    slp,gr124               ;static link pointer (also x9)
        .reg    msp,gr125               ;memory stack pointer
        .reg    rab,gr126               ;register alloc bound
        .reg    rfb,gr127               ;register frame bound

; Local compiler registers - output parameters, etc.
; (only valid if frame has been established)

        .reg    p15,lr17                ;parameter registers
        .reg    p14,lr16
        .reg    p13,lr15
        .reg    p12,lr14
        .reg    p11,lr13
        .reg    p10,lr12
        .reg    p9,lr11
        .reg    p8,lr10
        .reg    p7,lr9
        .reg    p6,lr8
        .reg    p5,lr7
        .reg    p4,lr6
        .reg    p3,lr5
        .reg    p2,lr4
        .reg    p1,lr3
        .reg    p0,lr2

; TLB register count

        .equ    TLB_CNT,128
        .eject



; constants for general use

        .equ    WRD_SIZ,4                       ;word size
        .equ    TRUE,0x80000000                 ;logical true (bit 31)
        .equ    FALSE,0x00000000                ;logical false
        .equ    CHKPAT_a5,0xa5a5a5a5            ;check pattern

; constants for data access control

        .equ    CE,0b1                          ;co-processor enable
        .equ    CD,0b0                          ;co-processor disable
        .equ    AS,0b1000000                    ;set for I/O
        .equ    PA,0b0100000                    ;set for physical addressing
        .equ    SB,0b0010000                    ;set for set BP
        .equ    UA,0b0001000                    ;set for user access
        .equ    ROM_OPT,0b100                   ;OPT values for access
        .equ    DATA_OPT,0b000
        .equ    INST_OPT,0b000
        .equ    ROM_CTL,(PA + ROM_OPT)          ;control field
        .equ    DATA_CTL,(PA + DATA_OPT)
        .equ    INST_CTL,(PA + INST_OPT)
        .equ    IO_CTL,(AS + PA + DATA_OPT)



        .eject
; defined vectors

        .equ    V_IllegalOp,0
        .equ    V_Unaligned,1
        .equ    V_OutOfRange,2
        .equ    V_NoCoProc,3
        .equ    V_CoProcExcept,4
        .equ    V_ProtViol,5
        .equ    V_InstAccExcept,6
        .equ    V_DataAccExcept,7
        .equ    V_UserInstTLB,8
        .equ    V_UserDataTLB,9
        .equ    V_SupInstTLB,10
        .equ    V_SupDataTLB,11
        .equ    V_InstTLBProt,12
        .equ    V_DataTLBProt,13
        .equ    V_Timer,14
        .equ    V_Trace,15
        .equ    V_INTR0,16
        .equ    V_INTR1,17
        .equ    V_INTR2,18
        .equ    V_INTR3,19
        .equ    V_TRAP0,20
        .equ    V_TRAP1,21
; 22 - 31 reserved
        .equ    V_MULTIPLY,32
        .equ    V_DIVIDE,33
        .equ    V_MULTIPLU,34
        .equ    V_DIVIDU,35
        .equ    V_CONVERT,36
; 37 - 41 reserved
        .equ    V_FEQ,42
        .equ    V_DEQ,43
        .equ    V_FGT,44
        .equ    V_DGT,45
        .equ    V_FGE,46
        .equ    V_DGE,47
        .equ    V_FADD,48
        .equ    V_DADD,49
        .equ    V_FSUB,50
        .equ    V_DSUB,51
        .equ    V_FMUL,52
        .equ    V_DMUL,53
        .equ    V_FDIV,54
        .equ    V_DDIV,55
; 56 - 63 reserved
        .equ    V_SPILL,64
        .equ    V_FILL,65
        .equ    V_BSDCALL,66
        .equ    V_SYSVCALL,67
        .equ    V_BRKPNT,68
        .equ    V_EPI_OS,69

        .eject






        .macro  R_LEFT,REGVAR
; Rotate left
;
; Parameters:   REGVAR  register to rotate

        add     REGVAR,REGVAR,REGVAR    ;shift left by 1 bit, C = MSB
        addc    REGVAR,REGVAR,0         ;add C to LSB
        .endm
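
The R_LEFT macro rotates a register left by one bit using only an add and an add-with-carry. A minimal C sketch of the same arithmetic follows; the function name rotate_left_1 is hypothetical and the carry flag is modeled as an ordinary variable.

```c
#include <assert.h>
#include <stdint.h>

/* Models R_LEFT on a 32-bit value. */
uint32_t rotate_left_1(uint32_t x)
{
    uint32_t carry   = x >> 31;   /* add x,x,x leaves the old MSB in C */
    uint32_t doubled = x + x;     /* same as a 1-bit left shift, modulo 2^32 */
    uint32_t rotated = doubled + carry;   /* addc x,x,0 adds C into the LSB */

    /* The doubled value always has a clear LSB, so adding C cannot carry. */
    assert((doubled & 1u) == 0u);
    return rotated;
}
```

For example, rotating 0x80000001 gives 0x00000003: the top bit wraps around into bit 0 while bit 0 moves up to bit 1.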



        .macro  FUNCTION,NAME,INCNT,LOCCNT,OUTCNT
; Introduces a non-leaf routine.
;
; This macro defines the standard tag word before the function,
; then establishes the statement label with the function's name
; and finally allocates a register stack frame. It may not be used
; if a memory stack frame is required.
;
; Note also that the size of the register stack frame is limited.
; Neither this nor the lack of a memory frame is considered to be
; a severe restriction in an assembly-language environment. The
; assembler will report errors if the requested frame is too large
; for this macro.
;
; It may be good practice to allocate an even number of both output
; registers and local registers. This will help in maintaining
; double word alignment within these groups. The macro will assure
; double word alignment of the stack frame as a whole, as required
; for correct linkage.
;
; Parameters:   NAME    the function name
;               INCNT   input parameter count
;               LOCCNT  local register count
;               OUTCNT  output parameter count

        .set    ALLOC_CNT,((2 + OUTCNT + LOCCNT) << 2)
        .set    PAD_CNT,(ALLOC_CNT & 4)
        .set    ALLOC_CNT,(ALLOC_CNT + PAD_CNT)
        .if     (INCNT)
        .set    IN_PRM,(4 + OUTCNT + PAD_CNT + LOCCNT + 0x80)
        .endif
        .if     (LOCCNT)
        .set    LOC_REG,(2 + OUTCNT + PAD_CNT + 0x80)
        .endif
        .if     (OUTCNT)
        .set    OUT_PRM,(2 + 0x80)
        .endif
        .word   ((2 + OUTCNT + LOCCNT) << 16)
NAME:
        sub     rsp,rsp,ALLOC_CNT
        asgeu   V_SPILL,rsp,rab
        add     lr1,rsp,((4 + OUTCNT + LOCCNT + INCNT) << 2)
        .endm
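
The frame-size arithmetic performed by the .set directives in FUNCTION can be checked with a short worked example. The following minimal C sketch mirrors only the ALLOC_CNT and PAD_CNT expressions; the names frame_size and function_frame are hypothetical, and the upper bound on the register counts is an arbitrary guard chosen for this sketch, not a limit stated in the listing.

```c
#include <assert.h>

typedef struct {
    unsigned alloc_bytes;   /* final ALLOC_CNT, in bytes */
    unsigned pad_bytes;     /* PAD_CNT: 0 or 4 */
} frame_size;

/* Mirrors: ALLOC_CNT = (2 + OUTCNT + LOCCNT) << 2
            PAD_CNT   = ALLOC_CNT & 4
            ALLOC_CNT = ALLOC_CNT + PAD_CNT */
frame_size function_frame(unsigned outcnt, unsigned loccnt)
{
    frame_size f;
    unsigned bytes;

    assert(outcnt + loccnt <= 126u);        /* arbitrary guard for the sketch */

    bytes = (2u + outcnt + loccnt) << 2;    /* two words plus outputs and locals */
    f.pad_bytes   = bytes & 4u;             /* 4 when the word count is odd */
    f.alloc_bytes = bytes + f.pad_bytes;    /* now a multiple of 8 bytes */
    return f;
}
```

With the values used for _Recurse (OUTCNT = 1, LOCCNT = 29) the result is 128 bytes, i.e. 32 registers, with no padding. With the values used for _main (OUTCNT = 1, LOCCNT = 2) the raw size is 20 bytes, so 4 bytes of padding bring the frame to 24 bytes and keep it double-word aligned.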



        .macro  LEAF,NAME,INCNT
; Introduces a leaf routine
;
; This macro defines the standard tag word before the function,
; then establishes the statement label with the function's name.
;
; Parameters:   NAME    the function name
;               INCNT   input parameter count

        .if     (INCNT)
        .set    IN_PRM,(2 + 0x80)
        .endif
        .set    ALLOC_CNT,0
        .word   0
NAME:
        .endm



        .macro  EPILOGUE
; Deallocates register stack frame (if and only if necessary)

        .if     (ALLOC_CNT)
        add     rsp,rsp,ALLOC_CNT
        nop
        jmpi    lr0
        asleu   V_FILL,lr1,rfb
        .else
        jmpi    lr0
        nop
        .endif
        .set    IN_PRM,(1024)           ;illegal, to cause err on ref
        .set    LOC_REG,(1024)          ;illegal, to cause err on ref
        .set    OUT_PRM,(1024)          ;illegal, to cause err on ref
        .set    ALLOC_CNT,(1024)        ;illegal, to cause err on ref
        .endm
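
The prologue assertion in FUNCTION (asgeu V_SPILL,rsp,rab) and the epilogue assertion in EPILOGUE (asleu V_FILL,lr1,rfb) trap to the spill and fill handlers in start.s when they fail. The bound arithmetic those handlers perform can be modeled on a host machine. The following is a minimal C sketch under stated assumptions: the names reg_stack, spill, and fill are hypothetical, addresses are plain 32-bit integers, and the memory transfers done by storem and loadm are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* The pointers involved in register stack management. */
typedef struct {
    uint32_t rsp;   /* register stack pointer */
    uint32_t rab;   /* register allocate bound */
    uint32_t rfb;   /* register frame bound */
    uint32_t lr1;   /* frame pointer of the returning routine */
} reg_stack;

/* Spill: entered with rab > rsp. Moves both bounds down by the same
   amount and returns the number of words that would be stored. */
uint32_t spill(reg_stack *s)
{
    uint32_t bytes;
    assert(s->rab > s->rsp);
    bytes   = s->rab - s->rsp;   /* sub R_Cnt,rab,rsp */
    s->rfb -= bytes;             /* sub rfb,rfb,R_Cnt */
    s->rab  = s->rsp;            /* add rab,rsp,0     */
    return bytes >> 2;
}

/* Fill: entered with lr1 > rfb. Moves both bounds up by the same
   amount and returns the number of words that would be loaded. */
uint32_t fill(reg_stack *s)
{
    uint32_t bytes;
    assert(s->lr1 > s->rfb);
    bytes   = s->lr1 - s->rfb;   /* sub R_Cnt,lr1,rfb */
    s->rab += bytes;             /* add rab,rab,R_Cnt */
    s->rfb  = s->lr1;            /* add rfb,lr1,0     */
    return bytes >> 2;
}
```

Because each handler moves rab and rfb by the same byte count, the relation rfb == rab + 512 stated in the handler headers is preserved across both operations.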



; Initial values for macro set variables to guard against misuse

        .set    IN_PRM,(1024)           ;illegal, to cause err on ref
        .set    LOC_REG,(1024)          ;illegal, to cause err on ref
        .set    OUT_PRM,(1024)          ;illegal, to cause err on ref
        .set    ALLOC_CNT,(1024)        ;illegal, to cause err on ref



; end of romdcl.h

APPENDIX E: test.ld

test.ld Linker Directives

see test.s and start.s for descriptions of sections

load test.o, start.o

order vectors=0, rstack, mstack, .bss, .data, .text, endsect
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Host Interface (HIF) v1.0 Specification 
Application Note 



by E. M. Greenawalt 






PREFACE 

This document describes HIF (v1.0), the Am29000 Architectural
Host Interface, and explains how to use it.
HIF is the software standard that defines the interface 
between the user's high-level language program and 
the Am29000 processor. The document is written for 
experienced programmers and assumes a working 
knowledge of the Am29000 microprocessor. 

INTRODUCTION 

Advanced Micro Devices is developing a complete line 
of Am29000™ simulators, hardware target execution
vehicles, and high-level language development tools for 
the Am29000 32-bit Streamlined Instruction Processor. 
These products are designed to support end-users who 
are building embedded system applications based on 
the Am29000 processor. For these users, often there is 
no existing operating system or kernel for their hardware 
design. 

Before AMD could create development tools for the
Am29000, a standard set of kernel services had to be
defined that would interface a user-application program,
written in a high-level language, to a host operating
system or an Am29000 processor.

HIF, the host interface, is the software specification that 
defines this standard set of kernel services. Figure 
NO TAG shows the level where HIF resides. As implied 
by the figure, HIF does not describe any particular im- 
plementation; but rather each simulator, hardware vehi- 
cle, and high-level language implements HIF in its own 
way. The kernel services provide the minimum function- 
ality needed to interface high-level language library 
functions to the user's operating system code. 

Using HIF, program modules written in any of the lan- 
guages available for the Am29000 can be combined, 
and the resulting program can run, without change, on 
any Am29000 simulator or hardware execution vehicle. 
Future AMD products will also use HIF, and AMD is 
actively encouraging third-party vendor support. 

AMD is indebted to Embedded Performance, Incorporated
(EPI), who originally developed the HIF concepts
and then graciously placed them in the public domain. 
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HIF APPLICATIONS 

The HIF specification has broad applications; currently it
provides the interface between the user's high-level 
language program and the following hardware and 
software products: 

• Am29000 Architectural Simulator. This software prod- 
uct provides the means to simulate the operation of 
the Am29000 in a specified system environment. It 
provides detailed performance statistics by modeling 
the internal architecture of the Am29000, as well as 
system memory configurations and timing. The HIF 
specification is implemented to provide the interface 
between the user's program and the host operating 
system. 

• PC Execution Board (PCEB29K™). This hardware/
software product contains an Am29000 processor
and memory and is an add-in board to IBM®
PC-based systems. Part of the HIF specification is
implemented on the board with another part implemented
on the PC, to interface with the DOS operating
system.

• Standalone Execution Board (STEB). This hardware 
product from STEP Engineering is intended to be an 
evaluation vehicle for the Am29000 and, optionally, 
Am29027™ Arithmetic Accelerator devices. The entire
HIF specification is implemented on this board,
which contains a resident monitor to implement the 
necessary kernel services. 

Because HIF is a general-purpose standard, it can be 
used to interface any high-level language to the 
Am29000. User programs need not be written entirely in 
a high-level language; they may incorporate assembly- 
language functions when maximized performance is the 
primary concern. 

HIF USERS 

There are three categories of end-users who need to 
know the details of the host interface: 

• Those using AMD-supplied hardware execution vehi- 
cles or simulators. This document defines the low- 
level mechanisms of HIF. With this information and 
the design concepts presented herein, end-users can 
extend the HIF environment to meet the needed 
degree of software functionality and sophistication. 

• Those developing a custom kernel operating system 
for an Am29000 design. These users need access to 
AMD's high-level and assembly-language develop- 
ment tools. This document provides the information 
required to build a HIF-conforming kernel that uses 
the high-level language development tools directly. 
With this information, end-users can extend and
customize the operating system code without interfering
with the basic capabilities of the HIF.

• Those who are using the AMD-supplied high-level 
language development tools, but who must conform 
to another kernel operating system interface. There is 
sufficient information in this document to enable users 
to modify the development tools to properly interface
with the target kernel's specifications. 

HIF CONCEPTS 

Programmers developing software in a high-level 
language do not work directly with the processor. 
Instead, they think in terms of a virtual machine ideally 
suited to the computational paradigm of the language. 
For instance, the C-language virtual machine has 
operations such as fprintf() and strcpy(), and the
FORTRAN machine has operations such as alog and 
sqrt. 

In actual practice, these virtual machines are imple- 
mented by libraries of object code that perform 
language-specific operations. As long as programmers 
use only the functions of the language's implied virtual 
machine, the programs will be portable across a broad 
range of implementations of the language. 

However, computer systems generally provide another 
virtual machine to the world: one that is defined by the 
operating system software. This virtual machine 
requires system calls to perform the services that are 
implemented within the operating system code. Typical 
services are: process management, file system 
management, device management, and memory 
management. 

The high-level language virtual machine usually 
consists of: (1) functions that can be implemented 
entirely within library routines, and (2) functions that 
require the services of the operating system. The func- 
tions of the first group (usually defined as the standard 
library for that language) are independent of the operat- 
ing system virtual machine on which they are imple- 
mented. The functions of the second group must be 
coded in terms of the operating system virtual machine. 
In other words, they must make system calls. 
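
A library function of the second group can be pictured as a thin wrapper that packages its arguments and passes them to the operating system. The following minimal C sketch illustrates that shape only. Everything in it is hypothetical: the service identifier SVC_WRITE, the os_call function (an ordinary function standing in for the trap into the kernel so that the sketch runs on a host), and the in-memory console buffer. It does not reproduce the actual HIF service numbers or calling convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical service identifier; not a real HIF service number. */
enum { SVC_WRITE = 1 };

/* Hypothetical stand-in for a device owned by the operating system. */
char   console[64];
size_t console_len = 0;

/* Hypothetical stand-in for the system call mechanism. On real hardware
   this would be a trap into supervisor mode. */
long os_call(int service, const void *buf, size_t len)
{
    size_t room = sizeof console - console_len;

    switch (service) {
    case SVC_WRITE:
        if (len > room)
            len = room;                         /* truncate to available space */
        memcpy(console + console_len, buf, len);
        console_len += len;
        return (long)len;                       /* bytes actually written */
    default:
        return -1;                              /* unknown service */
    }
}

/* Library-level wrapper: its caller never sees the service number. */
long lib_write(const void *buf, size_t len)
{
    assert(buf != NULL);
    return os_call(SVC_WRITE, buf, len);
}
```

The wrapper is the only code that knows how the service is requested, so the same high-level call can be retargeted to a different kernel by replacing os_call alone.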

It is often useful for end-users to also make system calls, 
even though this practice makes their programs less 
portable. This requirement can be accommodated by 
augmenting the language library with glue routines that 
specifically invoke the system calls, while providing the 
end user with suitable high-level syntax and semantics. 
(For detailed information on the glue routines for the
C compiler, see the HighC29K Reference Manual,
"Appendix A, Host Interface Definition.")

Given the above discussion, the required task is to create
high-level language development tools that can be
used easily and efficiently on a variety of execution
vehicles. This task can be broken down into the following
steps:

• Define an operating system virtual machine that 
provides sufficient functionality to support the funda- 
mental requirements of each high-level language, but 
not so much as to require a massive development 
effort to create. 

• Add appropriate glue routines to the standard libraries 
of the language so that the libraries are defined in 
terms of the operating system virtual machine. 

• Implement the operating system's virtual machine 
services on the various execution vehicles. For 
hardware vehicles, the virtual machine is imple- 
mented by a kernel, typically contained in a resident 
monitor software program. For simulation vehicles, 
the virtual machine is implemented by code internal to 
the simulator and by code simulated by the simulator. 

For the Am29000 hardware and software support prod- 
ucts, HIF consists of the following operating system 
virtual machine definitions: 

• A carefully defined, efficient system call mechanism. 
Accessing an HIF kernel service requires a transition 
from user mode to supervisor mode on the processor. 
This requires a specific mechanism, such as a trap 
handler, to be invoked. 

• A set of services that support the primitive require- 
ments of C, FORTRAN, and Pascal. Most of the 
services are defined according to UNIX® operating 
system interface specifications. 

• A specification of the environment created by the 
kernel. This involves the definition of storage alloca- 
tion and register initializations implemented by the 
kernel. 



IMPLEMENTATION TYPES 

Implementations of the HIF specification take two fun- 
damental forms: self-hosted and embedded. Examples 
of each of these are provided in the Standalone Execu- 
tion Board (STEB) manufactured by STEP Engineering 
and AMD's PC Execution Board (PCEB29K).

The STEB is a single-board computer that incorporates 
an Am29000 processor, an optional Am29027 arithmetic
accelerator, program and data memory, serial ports,
and timer-counter resources. The HIF implementation 
for this board consists of a resident monitor program that 
is downloaded into low-memory locations, and which 
implements the kernel services described in the "HIF 
Service Routine" section of this document. This is a self- 
hosted implementation. 

In contrast to the STEB, the PCEB29K is an add-in 
board for IBM PC-compatible computers that incorpo- 
rates an Am29000 processor, program and data mem- 
ory, serial ports, and timer-counter resources. The HIF 
implementation for this board consists of two portions of 
code. One performs some of the kernel services on the
board and the other performs some of the kernel services
through the auspices of the DOS operating system.
In the sense that the HIF is grafted onto the existing host 
operating system, it is called an embedded implementa- 
tion. The architectural and instruction simulators are 
also embedded implementations because they share 
the HIF implementation between custom code and 
existing host-computer operating-system code. 

There is no preference for either type of implementation 
as long as the services and features of the HIF specifica- 
tion are fully implemented in the target environment. 
With the standard interfaces that a HIF implementation 
presents, application programs written for one environ- 
ment will run equally well in another. 

HIF SERVICES PREVIEW 

Table 1 lists the services defined by the HIF interface. 
Most are similar or identical to equivalent UNIX operat- 
ing system calls. The titles given in column one are not 
the names that actually exist in a particular library but, 
instead, are the generic names of the services, for the 
purpose of this overview.

Table 1. HIF Services

Name        Description

clock       Returns the elapsed processor time, in milliseconds
close       Closes a file
cycles      Returns processor cycle counts
exit        Terminates a program
getargs     Returns an argument address
getenv      Gets the environment
getpsize    Returns the memory page size
lseek       Sets a file position
open        Opens a file
read        Reads a buffer of data from a file
remove      Removes (deletes) a file
rename      Renames a file
sysalloc    Allocates memory space
sysfree     Frees allocated memory space
setvec      Sets user trap addresses
time        Returns number of seconds since Jan. 1, 1970
tmpname     Returns a temporary file name
write       Writes a buffer of data to a file

INTENDED AUDIENCE

This document is intended for systems designers and 
programmers who have a working knowledge of the 
Am29000 and its supporting peripheral hardware. It 
does not cover CPU design, the Am29000 instruction 
set, or any other hardware detail. Those topics are 
adequately covered in the reference documents listed 
below. 

ABOUT THIS DOCUMENT 

The contents of each section and appendix of this 
document are described below: 

Section 1: Introduction - discusses the important
    concepts underlying the host interface
    definition and previews the services that
    form the basis of the HIF specification.

Section 2: System Call Mechanism - describes the
    mechanism used to make calls on the
    HIF services, and includes information
    on register usage for passing parameters
    and receiving results.

Section 3: Service Routine Descriptions - describes
    each of the services defined in
    HIF and shows details of the code
    sequences, including examples, for
    invoking the services.

Section 4: Process Environment - describes the
    standard memory allocation and register
    initializations performed by the HIF-
    conforming kernel prior to execution of a
    user program.

Appendix A: HIF Quick Reference - lists all of the
    services and service parameters used in
    this document, in quick reference form.

Appendix B: Error Messages - lists the error codes
    that HIF-conforming services may
    return.

REFERENCE DOCUMENTS 

The user should have access to the following AMD 
documents: 

• Am29000 Streamlined Instruction Processor User's
Manual, order #10620

• ADAPT29K User's Manual 

• MON29K User's Manual 

• MON29K Installation and Customization Manual 

• Am29000 Execution Board and Monitor User's 
Manual 

• ASM29K Utilities Manual from the ASM29K docu- 
mentation set 

• HighC29K Reference Manual from the HighC29K 
documentation set

DOCUMENTATION CONVENTIONS

This specification assumes some familiarity with the
UNIX operating system and the C language. In the
following sections, the conventions presented in the
subsections below are assumed.

Numeric Values 

All numeric values are presumed to be expressed in
decimal notation, unless otherwise stated. Hexadecimal
values are prefaced by the characters "0x." Any value
not prefaced by "0x" is defined to be a decimal number.
For example:

    100092        Decimal number
    0x100092      Hexadecimal number

The first number, above, is a decimal value by implication,
because it has not been prefaced by "0x." The
second constant includes the explicit "0x" prefix,
designating it as a hexadecimal value.
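
The convention above can be sketched in C as follows. This is only an illustration: the function name parse_hif_number is hypothetical and is not part of the HIF specification. A hand-written prefix test is used rather than strtoul with base 0, because base 0 would also treat a leading "0" as octal, which this convention does not define.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a literal using the document's convention.
 * A "0x" prefix means hexadecimal; anything else is read as decimal. */
unsigned long parse_hif_number(const char *text)
{
    if (strncmp(text, "0x", 2) == 0)
        return strtoul(text + 2, NULL, 16);   /* skip the prefix */
    return strtoul(text, NULL, 10);
}
```

With this sketch, "100092" and "0x100092" yield different values, and a string such as "010" is read as decimal ten.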

Character Strings 

In the documentation, frequent mention is made of char- 
acter strings that hold file names, path names, and en- 
vironment variable names. In all cases, the HIF 
Specification requires that strings be constructed as a 
sequence of ASCII characters terminated by a NULL 
byte (an 8-bit character composed of all zero bits). This 
is the form in which strings are represented in the C
language. Thus, the space reserved for a string must be 
one byte longer than the length of the string, to accom- 
modate the NULL byte. 

Languages such as Pascal, which require "counted" 
strings (that is, a single 8-bit byte in the first character of 
the string that specifies the number of bytes that follow), 
are required to convert these to NULL-terminated form 
before calling the HIF kernel services. In addition, 
languages other than C may need to convert strings 
passed back from the HIF kernel services to a compatible
internal form. All returned strings are in NULL-terminated
form.
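
As a minimal sketch of the conversion described above, the following C function turns a counted string (length byte first) into NULL-terminated form. The function name and the error convention are assumptions made for illustration; the HIF specification does not define such a routine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a counted string (first byte = length,
 * followed by that many characters) into a NULL-terminated string.
 * Returns 0 on success, or -1 if `out` cannot hold length + 1 bytes. */
int counted_to_cstring(const unsigned char *counted, char *out, size_t out_size)
{
    size_t length = counted[0];

    if (out_size < length + 1)      /* one extra byte for the NULL */
        return -1;
    memcpy(out, counted + 1, length);
    out[length] = '\0';
    return 0;
}
```

Note that the destination must be one byte longer than the string itself, which is the space requirement stated in the text.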



SYSTEM CALL MECHANISM 

System calls on Am29000-based systems are accom- 
plished through invocation of a specific software trap. 
The Am29000 traps are roughly equivalent to software 
interrupts on other CPUs. System call traps are invoked 
through execution of an appropriate assert instruction 
whose assertion is FALSE at the time the instruction is
executed. 

Execution of an ASEQ, ASGE, ASGEU, ASGT,
ASGTU, ASLE, ASLEU, ASLT, ASLTU, or ASNEQ
instruction, where the result of the assertion is FALSE,
will cause the trap specified in the instruction to be
taken.
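
The behavior of the assert instructions can be modeled in a few lines of C. This is a simplified sketch, not a processor model: the function names and the NO_TRAP value are hypothetical, and only two of the ten assert instructions are shown.

```c
#include <stdint.h>

#define NO_TRAP (-1)    /* hypothetical marker: execution simply continues */

/* Model of an "assert not equal" instruction: if the assertion
 * (a != b) is FALSE, the trap named in the instruction is taken.
 * Returns the trap number taken, or NO_TRAP. */
int assert_not_equal(int trap_number, uint32_t a, uint32_t b)
{
    return (a != b) ? NO_TRAP : trap_number;
}

/* Model of an "assert equal" instruction: traps when (a == b) is FALSE. */
int assert_equal(int trap_number, uint32_t a, uint32_t b)
{
    return (a == b) ? NO_TRAP : trap_number;
}
```

Asserting that a register is not equal to itself can never be TRUE, so in this model assert_not_equal(n, x, x) always takes trap n. That is the property a system call relies on to force a trap unconditionally.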

Once the trap is invoked, the Am29000 accesses a trap 
vector containing up to 256 separate trap handler 
addresses; or it may directly invoke a trap handler rou- 
tine, depending on the implementation of the operating 
system trapping mechanism and the state of the Vector 
Fetch (VF) bit in the processor's Configuration Register. 
In most implementations, a table of vectors is used. 
However, the operating system software may implement
direct trap execution instead. This bypasses the
vector table lookup and is therefore more efficient, but
it requires the reservation of a much greater amount of
system memory.

When a trap is taken, the normal program execution 
sequence is interrupted and the trap handler is invoked.
At this point, the current program's context is contained 
in Am29000 CPU registers. No saving or restoring of 
registers is performed by the processor when a trap 
occurs. HIF services are required to preserve the 
following registers and restore their contents before 
returning to the application program: 

• All local registers 

• Global registers gr1, gr112, gr115, and gr125

• Global registers gr126 and gr127 should be preserved
according to AMD calling conventions. Their values
may differ upon return from a HIF service, but the span
between their values will remain the same.

The HIF services may modify the contents of certain 
registers without first saving their values, namely: 
gr121, gr96, and gr97. However, the application program
should not count on gr96 through gr111 being left
untouched by current and future HIF kernel services.

HIF SERVICE INVOCATION 

Before invoking a HIF service, the service number and 
any input parameters to be passed must be loaded into 
Am29000 general registers. Both local and global regis- 
ters are used for various HIF services, as shown in the 
HIF Quick Reference table in Appendix A of this docu- 
ment. Details for invoking specific services are con- 
tained in the Service Routine Descriptions section. 

Service Number 

Every HIF system service is identified by a unique num- 
ber. Service numbers 0-127 and 256-383 are 
reserved for use by AMD and should not be used for 
user-supplied extensions.

        const   lr2,input_file      ; set input file
        consth  lr2,input_file      ; pathname address
        const   lr3,O_RDONLY        ; set open mode
        const   gr121,17            ; service number = 17 (open)
        asneq   69,gr1,gr1          ; force trap 69 (system call)


The service number must be loaded into global register 
gr121, the trap-handler argument register. Gr121 is a
temporary register and its value is not preserved over a 
system call, nor will its value be preserved over any trap 
invoked by the running program. 

Input Parameters 

Any input parameters to be passed must be placed in 
local registers lr2 through lr17. Input parameters are
passed to HIF services using the parameter passing
mechanism specified in the Am29000 calling conventions
documentation (Am29000 Streamlined Instruction
Processor User's Manual, order #10620).

Invoking a HIF Service 

The HIF services are accessed by forcing trap 69 to 
occur, after the service number and parameters (if any) 
are loaded in the designated registers. Trap handler 69 
executes the service in supervisor mode. 

Returned Values 

Most services return values, usually a single integer 
value (number of bytes read or written, number of clock 
ticks, size of a memory block, etc.), or a pointer (address 
of a file descriptor, address of a memory block, etc.). 
These values are returned in register gr96, per standard
high-level language calling conventions. 

If a service returns multiple values, the additional values 
are returned in gr97, gr98, and so forth. If the service 
fails to perform the requested task, the values contained 
in gr96 and succeeding registers are not guaranteed to 
be valid. 

See the documentation that accompanies your 
language processor for additional details on Am29000 
high-level language calling conventions. 

Status Reporting 

In all cases, upon return from a HIF service, global
register gr121 contains either a TRUE value (0x80000000),
or a positive non-zero integer error code indicating the
reason for failure. Pre-defined error codes are listed in
Appendix B of this document for existing HIF
implementations.

HIF does not specify these error codes. They may be 
completely defined by an implementation, except for 
cases in which there is a corresponding, existing, UNIX 
error code. In these cases, the UNIX error code is 
expected to be used. 
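
A C-language glue routine might interpret the status left in gr121 along the following lines. This is a sketch under stated assumptions: the names hif_succeeded and hif_error_code are hypothetical, and the value of gr121 is assumed to have already been copied into an ordinary 32-bit variable. Success is tested by looking at the most significant bit, which is set in the TRUE value 0x80000000 and clear in any positive error code.

```c
#include <stdbool.h>
#include <stdint.h>

#define HIF_TRUE 0x80000000u    /* TRUE value reported in gr121 on success */

/* Hypothetical helper: true when the gr121 value reports success. */
bool hif_succeeded(uint32_t gr121)
{
    return (gr121 & HIF_TRUE) != 0;
}

/* Hypothetical helper: the error code carried in gr121,
 * or 0 when the call succeeded. */
uint32_t hif_error_code(uint32_t gr121)
{
    return hif_succeeded(gr121) ? 0u : gr121;
}
```

A caller would check hif_succeeded first and only then trust the result registers, since the text states that gr96 and its successors are not guaranteed to be valid after a failure.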

Example Assembly Code 

The code fragment above shows how the definitions are
implemented in Am29000 assembly language to invoke
the open HIF service to open a file.

In this example, local register lr2 is loaded with the
address of the filename constant; local register lr3
contains the code O_RDONLY, indicating that the file is
to be opened for read-only access. The service number
(17) is loaded into global register gr121 and the service
is executed by asserting that register gr1 is not equal to
itself. Since this is FALSE, the trap is invoked.
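
The address in the example is loaded with a const/consth pair because each instruction carries only a 16-bit constant. The following C sketch models that two-step load as we understand it: const places the low half in the register and clears the high half, and consth then replaces the high half. The function names are hypothetical, and the sketch describes only the data movement, not instruction encoding.

```c
#include <stdint.h>

/* const: load a 16-bit constant into the low half of a register,
 * clearing the high half. */
uint32_t op_const(uint16_t low)
{
    return (uint32_t)low;
}

/* consth: replace the high half of a register, keeping the low half
 * that is already there. */
uint32_t op_consth(uint32_t reg, uint16_t high)
{
    return ((uint32_t)high << 16) | (reg & 0xFFFFu);
}

/* Load a full 32-bit address the way the const/consth pair does. */
uint32_t load_address(uint32_t address)
{
    uint32_t reg = op_const((uint16_t)(address & 0xFFFFu));
    return op_consth(reg, (uint16_t)(address >> 16));
}
```

Small values such as a service number or an open mode fit in 16 bits, which is why those registers are loaded with a single const.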

USER-MODE TRAPS

When a trap is invoked, the Am29000 switches from 
user mode to supervisor mode to execute the trap 
handler code. Most traps are properly executed in this 
mode, including the kernel services that implement the 
HIF specification. However, a few traps, such as the 
spill/fill handlers, are intended to execute in user mode. 
In these cases, the trap handler code is not part of the 
kernel, but is supplied by the particular high-level 
language product library and is linked with the user's 
application program. 

In order to use a consistent trap handling mechanism, 
and to support the individual language products'
methodologies for user-mode traps, the HIF service setvec
is called with the address of the user-mode trap
handler code for each of the traps handled in this way. 

Once the user-mode handler addresses have been sup- 
plied, and the corresponding trap is invoked, the operat- 
ing-system kernel receives control in supervisor mode. 
It then reinstates user mode and invokes the appropriate
language library trap handler to complete the
required operation. This bouncing from user mode to
supervisor mode and back to user mode is referred to as
a "trampoline" effect. When the trap handler's execution
is complete, it returns directly to the user's application
program, rather than back through the kernel.

The register stack spill/fill handlers are appropriate 
examples of code that is intended to execute in user 
mode. When a user's application program calls a func- 
tion that requires a large number of local registers to 
execute, some currently unused registers may have to 
be written to main memory to free enough of the on-chip 
registers. In this case, the registers are spilled to mem- 
ory via the spill-trap handler. When the function 
completes execution and intends to return to its caller, 
the spilled registers may have to be restored by calling
the fill-trap handler. Since register stack management is
unique for each application environment, individual
spill/fill handlers are provided with each of the high-level
language products.



HIF SERVICE ROUTINES 

The HIF service routine calls currently defined are listed 
by decimal service number in Table 2 below and 
described in detail in the following pages. 

Service numbers 0 through 127 and 256 through 383
are reserved by AMD and should not be used for user- 
supplied extensions. Table 3 describes the parameter 
names used in the service descriptions. 
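
An implementer adding user-supplied extensions could guard against the reserved ranges with a check like the one below. The function name is hypothetical; the ranges are the ones given in the text (0 through 127 and 256 through 383).

```c
#include <stdbool.h>

/* Hypothetical helper: true when `service` lies in a range that the
 * text reserves for AMD (0 through 127, or 256 through 383). */
bool hif_service_is_reserved(unsigned int service)
{
    return service <= 127u || (service >= 256u && service <= 383u);
}
```

For example, the open service (17) and the setvec service (289) both fall in reserved ranges, while 128 and 384 would be available for extensions under this reading.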



Table 2. HIF Service Calls

Number    Title       Description

1         exit        Terminate a program
17        open        Open a file
18        close       Close a file
19        read        Read a buffer of data from a file
20        write       Write a buffer of data to a file
21        lseek       Seek file byte
22        remove      Remove a file
23        rename      Rename a file
33        tmpnam      Return a temporary name
49        time        Return seconds
65        getenv      Get environment
257       sysalloc    Allocate memory space
258       sysfree     Free memory space
259       getpsize    Return memory page size
260       getargs     Return base address
273       clock       Return milliseconds
274       cycles      Return processor cycles
289       setvec      Set user trap address

Table 3. Service Call Parameters

Parameter   Description

addrptr     A pointer to an allocated memory area, command-line-argument
            array, pathname buffer, or NULL-terminated environment
            variable name string.
baseaddr    The base address of command-line-argument vector.
buffptr     A pointer to buffer area which data is to be read from or
            written to during the execution of I/O services.
count       The number of bytes actually read from a file or written to
            a file.
cycles      The number of processor cycles returned.
errcode     The error code returned by the service, usually the same as
            the codes returned in the UNIX variable errno. See Appendix B,
            Table 8, starting at page 35, for a list of HIF error codes.
exitcode    The exit code of the application program.
filename    A pointer to a NULL-terminated ASCII string containing the
            directory path of a temporary filename.
fileno      The file descriptor, a small integer number. Descriptors 0, 1,
            and 2 are guaranteed to exist and correspond to open files on
            program entry (0 is UNIX equivalent of stdin and is opened for
            input, 1 is UNIX stdout and is opened for output, 2 is UNIX
            stderr and is opened for output). The fileno is returned when
            an open call is successful.
funaddr     A pointer to the address of a service.
mode        A series of option flags whose values represent the operation
            to be performed.
msecs       Milliseconds.
name        A pointer to a NULL-terminated ASCII string that contains an
            environment variable name.
nbytes      The number of data bytes requested to be read from or written
            to a file, or number of bytes to allocate from the heap.
newfile     A pointer to a NULL-terminated ASCII string that contains the
            directory path of a new filename.
offset      The number of bytes from a specified position (orig) in a file.
oldfile     A pointer to NULL-terminated ASCII string that contains the
            directory path of the old filename.
orig        A value of 0, 1, or 2 that refers to the beginning, current
            position, or the position of the end of a file.
pagesize    The memory page size in bytes returned.
pathname    A pointer to a NULL-terminated ASCII string that contains the
            directory path of a filename.
pflag       The UNIX file access permission codes.
retval      The return value that indicates success or failure.
secs        The seconds count returned.
trapno      The trap number.
where       The current position in a specified file.
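
The relationship between the offset and orig parameters in Table 3 can be illustrated with a short C sketch. It assumes the conventional UNIX reading, in which offset is added to the position selected by orig; the enum names, the function name, and the -1 error convention are hypothetical and are not taken from the HIF specification.

```c
/* Values of the orig parameter as described in Table 3. */
enum hif_orig { HIF_ORIG_BEGIN = 0, HIF_ORIG_CURRENT = 1, HIF_ORIG_END = 2 };

/* Hypothetical helper: compute the file position selected by
 * (offset, orig). `current` is the present position and `size` the
 * file length, both in bytes. Returns the new position, or -1 for an
 * unknown orig or a position before the start of the file. */
long hif_resolve_position(long offset, int orig, long current, long size)
{
    long base;

    switch (orig) {
    case HIF_ORIG_BEGIN:   base = 0;       break;
    case HIF_ORIG_CURRENT: base = current; break;
    case HIF_ORIG_END:     base = size;    break;
    default:               return -1;
    }
    return (base + offset < 0) ? -1 : base + offset;
}
```

Under this sketch, an offset of 0 with orig equal to 2 selects the end of the file, and a negative offset with orig equal to 1 moves backward from the current position.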



Each service description on the pages that follow 
contains a concise explanation of the purpose of the 
service, the input and result register contents, and 
example assembly-language code to invoke the serv- 
ice. In all cases, operating system kernel services that 
meet the HIF specifications are invoked by forcing the 
software trap 69 to occur. The service number is always 
contained in general register gr121 and parameters 
are passed, if necessary, in local registers, beginning 
with lr2.



HIF implementations are required to return an error
code when a requested operation is not possible. The
codes from 0 to 255 are reserved for compatibility with
current and future error return standards. The currently
assigned codes and their meanings are listed in Appendix B,
Table 8, starting on page 35. If a HIF implementation
returns an error code in the range of 0 to 255, it
must carry the identical meaning to the corresponding
error code in this table. Error code values larger than
255 are available for implementation-specific errors.
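
The split between standardized and implementation-specific error codes might be expressed in C as shown below. The enum and function names are hypothetical; only the boundary at 255 comes from the text.

```c
enum hif_error_class {
    HIF_ERROR_STANDARD,         /* up to 255: meaning fixed by the error table */
    HIF_ERROR_IMPLEMENTATION    /* above 255: defined by the implementation    */
};

/* Hypothetical helper: classify an error code returned in gr121. */
enum hif_error_class hif_classify_error(unsigned long errcode)
{
    return (errcode <= 255ul) ? HIF_ERROR_STANDARD
                              : HIF_ERROR_IMPLEMENTATION;
}
```

A portable application would interpret only the standard class itself and treat anything in the implementation class as an opaque failure code.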



When the service returns, general register gr121 is 
required to report the success or failure of the service. If 
successful, gr121 is expected to contain a TRUE 
boolean value (a 1 bit in the most significant bit position). 
If the service is not successful, a positive non-zero error 
code is returned in gr121. If the service returns results, 
the first result is held in gr96, the second in gr97, and so 
forth. 



In the examples, references are made to error handlers 
that are not part of the example code. These are 
assumed to be contained in the larger part of the user's 
code and are not supplied as part of the HIF specification.
The JMPF instructions have been provided to show
that interface glue routines should incorporate this error
testing philosophy in order to be robust. In practice, error
handling may be relegated to a single routine, or may be
vested in individual sections of in-line code or in
services callable by the glue routines.

Since HIF implementations may exist over a wide spec- 
trum of systems, the capabilities of the HIF may vary 
from one system to the next. In the simplest case, the 
HIF implementation in an embedded Am29000 system, 
such as a printer controller, may contain no external file 
system. In this event, the input/output facilities specified 
in the kernel service descriptions need not be implemented.
In more common cases, where the HIF will exist
on systems that have full operating system
capabilities, such as DOS or UNIX, it is assumed that all
of the features of the HIF will be implemented. The service
descriptions in this document provide a set of standard
interfaces for commonly implemented operating
system interfaces. If individual features are implemented,
the interfaces are expected to follow the guidelines
in this specification.

Descriptions of the individual services follow on the 
remaining pages of this section. They are listed in 
numeric sequence by service number. Appendix A, HIF
Quick Reference, allows easy location of a service by its 
number.

Service 1 - exit

Terminate a Program

Description

This service terminates the current program and returns
a value to the system kernel, indicating the reason for
termination. By convention, a zero passed in lr2
indicates normal termination, while any non-zero value
indicates an abnormal termination condition. There are
no returned values in registers gr96 and gr121 since this
service does not return.

Register Usage

Type       Regs     Contents     Description

Calling:   gr121    1 (0x1)      Service number
           lr2      exitcode     User-supplied exit code
Returns:   gr96     undefined    This service call does not return
           gr121    undefined    This service call does not return


Example Call

        const   lr2,1           ; exit code = 1
        const   gr121,1         ; service = 1
        asneq   69,gr1,gr1      ; call the operating system


In the above example, the operating system kernel is
being called with service code 1 and an exit code of 1,
which is interpreted according to the specifications of
the individual operating system. The value of the exit
code is not defined as part of the HIF specification.

In general, however, an exit code of zero (0) specifies a
normal program termination condition, while a non-zero
code specifies an abnormal termination resulting from
detection of an error condition within the program.

Programs can terminate normally by falling through the 
curly brace at the end of the main function in a 
C-language program. Other languages may require an 
explicit call to the kernel's exit service.

Service 17 - open

Open a File

Description

This service opens a named file in a requested mode.
Files must be explicitly opened before any read, write,
close, or other file positioning accesses can be accomplished.
The open service, if successful, returns an
integer token that is used to refer to the file in all
subsequent service requests. In many high-level languages,
the returned token is referred to as a "file descriptor."

Register Usage

Type       Regs     Contents      Description

Calling:   gr121    17 (0x11)     Service number
           lr2      pathname      A pointer to a filename
           lr3      mode          See parameter descriptions below
           lr4      pflag         See parameter descriptions below
Returns:   gr96     fileno        Success: >= 0 (file descriptor)
                                  Failure: < 0
           gr121    0x80000000    Logical TRUE, service successful
                    errcode       Error number, service not successful
                                  (implementation dependent)


Parameter Descriptions 

Pathname is a pointer to a zero-terminated string that
contains the full path and name of the file being
opened.* Individual operating systems have different
means to specify this information. With hierarchical file
systems, individual directory levels are separated with
special characters that cannot be part of a valid filename
or directory name. In UNIX-compatible file
systems, directory names are separated by forward
slash characters "/" (e.g., "/usr/jack/files/myfile"), where
"usr," "jack," and "files" are succeedingly lower directory
levels, beginning at the root directory of the file system.
The name "myfile" is the filename to be opened at the
specified level. The individual characteristics of files and
pathnames are determined by the specifications of a
particular operating system implementation.

Mode is composed of a set of flags, whose mnemonics 
and associated values are listed in Table 4. 



Table 4. Open Service Parameters

Name        Value     Description

O_RDONLY    0x0000    Open for read only access
O_WRONLY    0x0001    Open for write only access
O_RDWR      0x0002    Open for read and write access
O_APPEND    0x0008    Always append when writing
O_NDELAY    0x0010    No delay
O_CREAT     0x0200    Create file if it does not exist
O_TRUNC     0x0400    If the file exists, truncate it to zero length
O_EXCL      0x0800    Fail if writing and the file exists
O_FORM      0x4000    Open in text format


The O_RDONLY mode provides the means to open a
file and guarantee that subsequent accesses to that file
will be limited to read operations. The operating system
implementation will determine how errors are reported
for unauthorized operations. The file pointer is
positioned at the beginning of the file, unless the
O_APPEND mode is also selected.



* The HIF specification intentionally refrains from defining the constituents of a legal pathname, or any intrinsic characteristics of the implemented
file system. In this regard, the only requirement of a HIF-conforming kernel is that when the open service is successfully performed, the
routine returns a small integer value that can be used in subsequent input/output service calls to refer to the opened file.






The O_WRONLY mode provides the means to open a
file and guarantee that subsequent accesses to that file
will be limited to write operations. The operating system
implementation will determine how errors are reported
for unauthorized operations. The file pointer is
positioned at the beginning of the file, unless the
O_APPEND mode is also selected.

The O_RDWR mode provides the means to open a file
for subsequent read and write accesses. The file
pointer is positioned at the beginning of the file, unless
the O_APPEND mode is also selected.

If O_APPEND mode is selected, the file pointer is
positioned to the end of the file at the conclusion of a
successful open operation, so that data written to the
file is added following the existing file contents.

Ordinarily, a file must already exist in order to be
opened. If the O_CREAT mode is selected, files that do
not currently exist are created; otherwise, the open
function will return an error condition in gr121.

If a file being opened already exists and the O_TRUNC
mode is selected, the original contents of the file are
discarded and the file pointer is placed at the beginning of
the (empty) file. If the file does not already exist, the HIF
service routine should return an error value in gr121,
unless O_CREAT mode is also selected.

The O_EXCL mode provides a method for refusing to
open the file if the O_WRONLY or O_RDWR modes are
selected and the file already exists. In this case, the
kernel service routine should return an error code in
gr121.

O_FORM mode indicates that the file is to be opened as
a text file, rather than a binary file. The nominal standard
input, output, and error files (file descriptors 0, 1, and 2)
are assumed to be open in text mode prior to commencing
execution of the user's program.
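
The mode word is built by OR-ing flag values together. The C sketch below copies the values from Table 4 and shows one way to compose and sanity-check a mode. The HIF_ prefix, the helper names, and the validity rule are assumptions made for illustration; in particular, treating the two low-order bits as the access mode is inferred from the values 0, 1, and 2 and is not stated in the text.

```c
#include <stdbool.h>

/* Flag values copied from Table 4 (names prefixed to avoid clashing
 * with a host system's own definitions). */
enum {
    HIF_O_RDONLY = 0x0000,
    HIF_O_WRONLY = 0x0001,
    HIF_O_RDWR   = 0x0002,
    HIF_O_APPEND = 0x0008,
    HIF_O_NDELAY = 0x0010,
    HIF_O_CREAT  = 0x0200,
    HIF_O_TRUNC  = 0x0400,
    HIF_O_EXCL   = 0x0800,
    HIF_O_FORM   = 0x4000
};

/* Assumption: the access mode occupies the two low-order bits. */
int hif_access_mode(int mode)
{
    return mode & 0x0003;
}

/* True when every bit in `mode` is a flag listed in Table 4 and the
 * access mode is one of the three defined values. */
bool hif_mode_is_valid(int mode)
{
    const int known = 0x0003 | HIF_O_APPEND | HIF_O_NDELAY | HIF_O_CREAT |
                      HIF_O_TRUNC | HIF_O_EXCL | HIF_O_FORM;

    return (mode & ~known) == 0 && hif_access_mode(mode) != 0x0003;
}
```

Because the read-only flag has the value zero, it cannot be tested with a bitwise AND; comparing the access-mode field against it is the reliable check.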



When opening a FIFO (interprocess communication
file) with O_RDONLY or O_WRONLY set, the following
conditions apply:

• If O_NDELAY is set (i.e., equal to 0x0010):

  - A read-only open will return without delay.

  - A write-only open will return an error if no process
    currently has the file open for reading.

• If O_NDELAY is clear (i.e., equal to 0x0000):

  - A read-only open will block until a process opens
    the file for writing.

  - A write-only open will block until a process opens
    the file for reading.

When opening a file associated with a communication
line (e.g., a remote modem or terminal connection), the
following conditions apply:

• If O_NDELAY is set, the open will return without
  waiting for the carrier detect condition to be TRUE.

• If O_NDELAY is clear, the open will block until the
  carrier is found to be present.

The optional pflag parameter specifies the access
permissions associated with a file; it is only required
when O_CREAT is also specified (i.e., create a new file
if it does not already exist). If the file already exists, pflag
is ignored. This parameter specifies UNIX-style file
access permission codes (r, w, and x for read, write, and
execute, respectively) for the file's owner, the work
group, and other users. If the parameter is missing, pflag
will be set to -1 (all accesses allowed). See the UNIX
operating system documentation for additional
information on this topic.

Example Call

path:   .ascii  "/usr/jack/files/myfile\0"
        .set    mode,O_RDWR | O_CREAT | O_FORM
        .set    permit,0x180
fd:     .word   0

        const   lr2,path            ; address of pathname
        consth  lr2,path
        const   lr3,mode            ; open mode settings
        const   lr4,permit          ; permissions
        const   gr121,17            ; service = 17 (open)
        asneq   69,gr1,gr1          ; perform OS call
        jmpf    gr121,open_err      ; jump if error on open
        const   gr120,fd            ; set address of
        consth  gr120,fd            ; file descriptor
        store   0,0,gr96,gr120      ; store file descriptor


In the above example, the file is being opened in read/
write text mode. The UNIX permissions of the owner are
set to allow reading and writing, but not execution, and
all other permissions are denied. As indicated above in
the parameter descriptions, the file permissions are only
used if the file does not already exist. When the open
service returns, the program jumps to the open_err
error handler if the open was not successful; otherwise,
the file descriptor returned by the service is stored for
future use in read, write, lseek, remove, rename, or
close service calls.
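
The permission value 0x180 used in the example can be decoded with a short C routine. This sketch assumes the usual UNIX layout of nine permission bits (owner, group, other, each with read, write, and execute); the function name is hypothetical and the routine is not part of the HIF specification.

```c
/* Hypothetical helper: render the low nine UNIX permission bits as
 * "rwxrwxrwx"-style text. `out` must have room for 10 bytes. */
void hif_format_permissions(unsigned int pflag, char out[10])
{
    static const char symbols[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++) {
        unsigned int bit = 1u << (8 - i);   /* owner read is bit 8 (0x100) */
        out[i] = (pflag & bit) ? symbols[i] : '-';
    }
    out[9] = '\0';
}
```

Under this layout, 0x180 sets only the owner read bit (0x100) and the owner write bit (0x080), which matches the description of read and write access for the owner and nothing for anyone else.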



As described in the introduction to these services, the
HIF can be implemented to several degrees of elaboration,
depending on the underlying system hardware,
and whether the operating system is able to provide the
full set of kernel services. In the least capable instance
(i.e., a standalone board with a serial port), it is likely that
only the O_RDONLY, O_WRONLY and O_RDWR
modes will be supported. In more capable systems, the
additional modes should be implemented, if possible.

Service 18 - close

Close a File

Description

This service closes the open file associated with the file
descriptor passed in lr2. Closing all files is automatic on
program exit (see exit), but since there is an
implementation-defined limit on the number of open files per
process, an explicit close service call is necessary for
programs that deal with many files.

Register Usage

Type       Regs     Contents      Description

Calling:   gr121    18 (0x12)     Service number
           lr2      fileno        File descriptor
Returns:   gr96     retval        Success: = 0
                                  Failure: < 0
           gr121    0x80000000    Logical TRUE, service successful
                    errcode       Error number, service not successful
                                  (implementation dependent)


Example Call

fd:     .word   0

        const   gr96,fd             ; set address of
        consth  gr96,fd             ; file descriptor
        load    0,0,lr2,gr96        ; get file descriptor
        const   gr121,18            ; service = 18
        asneq   69,gr1,gr1          ; and call the OS
        jmpf    gr121,clos_err      ; handle close error
        nop                         ;



The above example illustrates loading a previously 
stored file descriptor [fd, in this case) and calling the 
kernel's close service to close the file associated with 
that descriptor. If an error occurs when attempting to 



close the file, the kernel will return an error code in gr121 
(the content of that register will not be TRUE) and the 
program will jump to an error handler; otherwise, 
program execution will continue. 
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Service 19— read 



Read a Buffer of Data from a File 



Description 

This service reads a number of bytes from a previously
opened file (identified by a small integer file descriptor in
lr2 that was returned by the open service) into memory
starting at the address given by the buffer pointer in lr3.
Lr4 contains the number of bytes to be read. The number
of bytes actually read is returned in gr96. Zero is
returned in gr96 if the file is already positioned at its
end-of-file. If an error is detected, a small positive integer is
returned in gr121, indicating the nature of the error.



Type      Regs    Contents      Description

Calling:  gr121   19 (0x13)     Service number
          lr2     fileno        File descriptor
          lr3     buffptr       A pointer to buffer area
          lr4     nbytes        Number of bytes to be read

Returns:  gr96    count         Success: > 0 (number of bytes actually read)
                                EOF:     = 0
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



fd:     .word   0
        .set    BUFSIZE,256
buf:    .block  BUFSIZE
num:    .word   0

        const   gr96,fd             ; set address of
        consth  gr96,fd             ; file descriptor
        load    0,0,lr2,gr96        ; get file descriptor
        const   lr3,buf             ; set buffer address
        consth  lr3,buf
        const   lr4,BUFSIZE         ; specify buffer size
        const   gr121,19            ; service = 19
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,rd_err        ; handle read errors
        const   gr120,num           ; set address of
        consth  gr120,num           ; 'num' argument
        store   0,0,gr96,gr120      ; store bytes read



The above example requests the HIF to return BUFSIZE
bytes from the file descriptor contained in the variable fd.
If the call is successful, gr121 will contain a TRUE value
and gr96 will contain the number of bytes actually read.
If the service fails, gr121 will contain the error code.
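
Because read may return fewer bytes than requested and signals end-of-file with a count of zero, callers normally loop. The Python sketch below illustrates that loop under stated assumptions: hif_read is a hypothetical stand-in for service 19 that returns the gr96 count, the gr121 status and the data, and it is backed by an ordinary in-memory stream rather than a real kernel.

```python
import io

HIF_TRUE = 0x80000000


def hif_read(stream, nbytes):
    """Hypothetical stand-in for service 19.

    Returns (count, status, data), mirroring gr96, gr121 and the
    bytes placed in the caller's buffer.
    """
    data = stream.read(nbytes)
    return len(data), HIF_TRUE, data


def read_all(stream, bufsize=256):
    """Read until the service reports EOF (count == 0)."""
    chunks = []
    while True:
        count, status, data = hif_read(stream, bufsize)
        if not status & HIF_TRUE:   # gr121 not TRUE: error code
            raise OSError(status)
        if count == 0:              # EOF: nothing more to read
            break
        chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    print(len(read_all(io.BytesIO(b"x" * 600))))
```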
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Service 20— write 



Write a Buffer of Data to a File 



Description 

This service writes a number of bytes from memory
(starting at the address given by the pointer in lr3) into
the file specified by the small positive integer file
descriptor that was returned by the open service when
the file was opened for writing. Lr4 contains the number
of bytes to be written. The number of bytes actually
written is returned in gr96. If an error is detected, gr121
will contain a small positive integer on return from the
service, indicating the nature of the error.



Type      Regs    Contents      Description

Calling:  gr121   20 (0x14)     Service number
          lr2     fileno        File descriptor
          lr3     buffptr       A pointer to the buffer area
          lr4     nbytes        Number of bytes to be written

Returns:  gr96    count         Success: = lr4
                                Failure: 0 < gr96 < lr4
                                Extreme: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



fd:     .word   0
        .set    BUFSIZE,256
buf:    .block  BUFSIZE
num:    .word   0

        const   gr96,fd             ; set address of
        consth  gr96,fd             ; file descriptor
        load    0,0,lr2,gr96        ; get file descriptor
        const   lr3,buf             ; set buffer address
        consth  lr3,buf
        const   lr4,BUFSIZE         ; specify buffer size
        const   gr121,20            ; service = 20
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,wr_err        ; handle write errors
        const   gr120,num           ; set address of
        consth  gr120,num           ; "num" variable
        store   0,0,gr96,gr120      ; store bytes written



The above example writes BUFSIZE bytes from the
buffer located at buf to the file associated with the
descriptor stored in fd. If errors are detected during
execution of the service, the value returned in gr121 will
be FALSE. In this case, the wr_err error handler will be
invoked. The number of bytes actually written is stored
in the variable num.
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Service 21— Iseek 

Description 

This service positions the file associated with the file
descriptor in lr2, offset number of bytes from the position
of the file referred to by the orig parameter. Lr3
contains the number of bytes offset and lr4 contains the
value for orig. The parameter orig is defined as:

0 = Beginning of the file

1 = Current position of the file

2 = End of the file

The lseek service can be used to reposition the file
pointer anywhere in a file. The offset parameter may
be either positive or negative. However, it is considered
an error to attempt to seek in front of the beginning of the
file.



Type      Regs    Contents      Description

Calling:  gr121   21 (0x15)     Service number
          lr2     fileno        File descriptor
          lr3     offset        Number of bytes offset from orig
          lr4     orig          A code number indicating the point within
                                the file from which the offset is counted

Returns:  gr96    where         Success: >= 0 (current position in the file)
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



fd:     .word   6                   ; file descriptor = 6
orig:   .word   0                   ; origin = start of file
off:    .word   23                  ; offset = 23 bytes

        const   gr96,fd             ; set address of
        consth  gr96,fd             ; file descriptor
        load    0,0,lr2,gr96        ; get file descriptor
        const   gr96,off            ; set address of
        consth  gr96,off            ; offset argument
        load    0,0,lr3,gr96        ; get offset
        const   gr96,orig           ; set address of
        consth  gr96,orig           ; origin argument
        load    0,0,lr4,gr96        ; get origin
        const   gr121,21            ; service = 21
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,seek_err      ; seek error if false
        nop







The above example shows how a file can be positioned
to a particular byte address by specifying the orig, which
is the starting point from which the file position is
adjusted, and the offset, which is the number of bytes
from the orig to move the file pointer. In this case, the
file identified by file descriptor 6 is being repositioned
to byte 23, measured from the beginning of the file
(orig=0).

The file descriptor, offset, and orig values are loaded
from preset constants and lseek is called to perform the
file positioning operation. If an error occurs when
attempting to reposition the file, the value returned in
gr121 is not TRUE, containing an error code that
indicates the reason for the error. Upon return, gr96 also
contains the file position measured from the beginning
of the file. In this case, this value is not stored.
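
The arithmetic behind the orig and offset parameters can be sketched as follows. This Python fragment is an illustration only: the function name lseek_target and its current and size arguments are hypothetical, and it models just the position calculation and the "seek in front of the beginning" error described above, not a real file system.

```python
def lseek_target(offset, orig, current, size):
    """Return the absolute file position an lseek request selects.

    orig follows the service definition: 0 = beginning of the file,
    1 = current position, 2 = end of the file. 'current' and 'size'
    are the file's present position and length in bytes.
    """
    if orig == 0:
        base = 0
    elif orig == 1:
        base = current
    elif orig == 2:
        base = size
    else:
        raise ValueError("unknown orig value")
    position = base + offset
    if position < 0:
        # Seeking in front of the beginning of the file is an error.
        raise ValueError("seek before beginning of file")
    return position


if __name__ == "__main__":
    print(lseek_target(23, 0, current=100, size=500))
```

With orig = 0 and offset = 23, as in the example call, the result is byte 23 regardless of the current position.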
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Service 22— remove 



Remove a File 



Description 

This service deletes a file from the file system. Lr2
contains a pointer to the pathname of the file. The path
must point to an existing file, and the referenced file
should not be currently open. The behavior of the
remove service is undefined if the file is open.



Type      Regs    Contents      Description

Calling:  gr121   22 (0x16)     Service number
          lr2     pathname      A pointer to string that contains the
                                pathname of the file

Returns:  gr96    retval        Success: = 0
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



path:   .ascii  "/usr/jack/files/myfile\0"

        const   lr2,path            ; set address of file
        consth  lr2,path            ; pathname.
        const   gr121,22            ; service = 22
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,rem_err       ; jump if error
        nop









In the above example, a file with a UNIX-style pathname
stored in the string named path is being removed. The
address of (pointer to) the string is put into lr2 and the
kernel service 22 is called to remove the file. If the file
does not exist, or if it has not previously been closed, an
error code will be returned in gr121; otherwise, the value
in gr121 will be TRUE.
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Service 23— rename 

Rename a File

Description

This service moves a file to a new location within the file
system. Lr2 contains a pointer to the file's old pathname
and lr3 contains a pointer to the file's new pathname.
When all components of the old and new pathnames are
the same, except for the filename, the file is said to have
been renamed. The file identified by the old pathname
must already exist, or an error code will be returned and
the rename operation will not be performed.



Type      Regs    Contents      Description

Calling:  gr121   23 (0x17)     Service number
          lr2     oldfile       A pointer to string containing the old
                                pathname of the file
          lr3     newfile       A pointer to string containing the new
                                pathname of the file

Returns:  gr96    retval        Success: = 0
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



old:    .ascii  "/usr/fred/pa;
new:    .ascii  "/usr/fred/hi;

        const   lr2,old             ; set address of old pathname
        consth  lr2,old
        const   lr3,new             ; set address of new pathname
        consth  lr3,new
        const   gr121,23            ; service = 23 (rename)
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,ren_err       ; jump if rename error
        nop



The above example moves a file from its old path 
(renaming it in the process) to its new pathname loca- 
tion. The file will no longer be found at the old location. 
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Service 33— tmpnam 



Return Temporary Name 



Description 

This service generates a string that can be used as a
temporary file pathname. A different name is generated
each time it is called. Generally, the name is guaranteed
not to duplicate any existing filename. The argument
passed in lr2 should be a valid pointer to a buffer that is
large enough to contain the constructed file name. HIF
implementations are required to allocate a minimum of
128 bytes for this purpose.

If the argument in lr2 contains a NULL pointer, the HIF
service routine should treat this as an error condition
and return a non-zero error number in global register
gr121.

The HIF specification sets no standards for the format or
content of legal pathnames; these are determined by
individual operating system requirements. However,
each implementation should undertake to construct a
valid filename that is also unique.

Type      Regs    Contents      Description

Calling:  gr121   33 (0x21)     Service number
          lr2     addrptr       A pointer to buffer into which the
                                filename is to be stored

Returns:  gr96    filename      Success: pointer to the temporary filename
                                string. This will be the same as lr2 on
                                entry unless an error occurred
                                Failure: = 0 (NULL pointer)
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



fbuf:   .block  21                  ; buffer size =

        const   lr2,fbuf            ; set buffer po
        consth  lr2,fbuf
        const   gr121,33            ; service = 33
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,tmp_err       ; jump if error
        nop







In the above example, the tmpnam service is called with
a pointer to fbuf, which has been allocated to hold a
name that is up to 21 bytes in length. If the service is able
to construct a valid name, the filename will be stored in
fbuf when the service returns. If the content of gr121 on
return is not TRUE, the program fragment jumps to
tmp_err to handle the error condition.
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Service 49 — time 



Return Seconds Since 1970 



Description 

This service returns, in register gr96, the number of
seconds elapsed since midnight, January 1, 1970, as a
32-bit integer value. It is assumed that the kernel service
will have access to a counter whose contents can be
preloaded that measures time, with at least a
one-second resolution, for this purpose.



Type      Regs    Contents      Description

Calling:  gr121   49 (0x31)     Service number

Returns:  gr96    secs          Success: ≠ 0 (time in seconds)
                                Failure: = 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

secs:   .word   0

        const   gr121,49            ; service = 49
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,tim_err       ; jump if error
        const   gr120,secs          ; set the address
        consth  gr120,secs          ; for storing 'secs'
        store   0,0,gr96,gr120      ; store the seconds



In the above example, the kernel service time is being
called. If the value returned in gr121 is TRUE, the
number of seconds returned in gr96 is stored in the secs
variable; otherwise, the program jumps to tim_err to
determine the cause of the error.
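
On a host system, the value returned in gr96 can be turned into a calendar date with standard library routines. The Python sketch below shows one way to do this; the function name secs_to_utc is hypothetical, and it assumes the epoch is midnight UTC, which the service description does not state explicitly.

```python
from datetime import datetime, timezone


def secs_to_utc(secs):
    """Convert a seconds-since-1970 count to a calendar date.

    Assumption: the epoch is midnight UTC on January 1, 1970.
    """
    return datetime.fromtimestamp(secs, tz=timezone.utc)


if __name__ == "__main__":
    print(secs_to_utc(0).isoformat())
    print(secs_to_utc(86400).isoformat())
```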
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Service 65— getenv 

Description 

This service searches the system environment for a
string associated with a specified symbol. Lr2 contains a
pointer to the symbol name. If the symbol name is found,
a pointer to the string associated with it is returned in
gr96; otherwise, a NULL pointer is returned.

In UNIX-hosted systems, the setenv command allows
a user to associate a symbol with an arbitrary string. For
example, the command

setenv TERM vt100

defines the string "vt100" to be associated with the
symbol named TERM. Application programs can use this
association to determine the type of terminal connected
to the system, and, therefore, use the correct set of
codes when outputting information to the user's screen.
To access the string, getenv should be called with lr2
pointing to a string containing the TERM symbol name.
The address returned in gr96 will point to the
corresponding "vt100" string if TERM is found. In
UNIX-hosted systems, entering a different setenv command
lets the user select a different terminal name without
requiring recompilation of the application program.

Operating system implementations that do not include 
provisions for environment variables should always 
return a NULL value in gr96 when this service is 
requested. 



Type      Regs    Contents      Description

Calling:  gr121   65 (0x41)     Service number
          lr2     name          A pointer to the symbol name

Returns:  gr96    addrptr       Success: pointer to the symbol name string
                                Failure: = 0 (NULL pointer)
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



mysym:  .ascii  "MYSYMBOL\0"
strptr: .word   0

        const   lr2,mysym           ; set address of symbol
        consth  lr2,mysym           ; to be located in environment
        const   gr121,65            ; service = 65
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,env_err       ; jump if error
        const   gr120,strptr        ; set address of
        consth  gr120,strptr        ; string pointer
        store   0,0,gr96,gr120      ; store string pointer



The above example program calls the operating system
getenv service to access a string associated with the
environment variable MYSYMBOL. If the symbol is
found, a pointer to the string associated with the symbol
is returned in gr96. If the call is not successful (i.e.,
gr121 holds a FALSE boolean value upon return), the
program jumps to env_err to handle the error condition.
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Service 257— sysalloc 

Description 

This service allocates a specified number of contiguous
bytes from the operating-system-maintained heap and
returns a pointer to the base of the allocated block. Lr2
contains the number of bytes requested. If the storage is
successfully allocated, gr96 contains a pointer to the
block; otherwise, gr121 contains an error code
indicating the reason for failure of the call.



Type      Regs    Contents      Description

Calling:  gr121   257 (0x101)   Service number
          lr2     nbytes        Number of bytes requested

Returns:  gr96    addrptr       Success: pointer to allocated bytes
                                Failure: = 0 (NULL pointer)
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)



Example Call 



blkptr: .word   0

        const   lr2,1200            ; request 1200 bytes
        const   gr121,257           ; service = 257
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,alloc_err     ; jump if error
        const   gr120,blkptr        ; set address to store
        consth  gr120,blkptr        ; pointer
        store   0,0,gr96,gr120      ; store the pointer



The above example requests a block of 1200 contiguous
bytes from the system heap. If the call is successful,
the program stores the pointer returned in gr96 into a
local variable called blkptr. If gr121 contains a boolean
FALSE value when the service returns, the program
jumps to alloc_err to handle the error condition.
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Service 258— sysfree 

Free Memory Space

Description

This service returns memory to the system starting at
the address specified in lr2. Lr3 contains the number of
bytes to be released. The pointer passed to the sysfree
service in lr2 and the byte count passed in lr3 must
match the address returned by a previous sysalloc
service request for the identical number of bytes. No
dynamic memory allocation structure is implied by this
service. High-level language library functions such as
malloc() and free() for the C language are required to
manage random dynamic memory block allocation and
deallocation, using the sysalloc and sysfree kernel
functions as their basis.

Register Usage



Type      Regs    Contents      Description

Calling:  gr121   258 (0x102)   Service number
          lr2     addrptr       Starting address of area returned
          lr3     nbytes        Number of bytes to release

Returns:  gr96    retval        Success: = 0
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

blkptr: .word   0

        const   gr120,blkptr        ; set address of previously
        consth  gr120,blkptr        ; block pointer
        load    0,0,lr2,gr120       ; fetch pointer to block
        const   lr3,1200            ; set number of bytes to release
        const   gr121,258           ; service = 258
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,free_err      ; jump if error
        nop







The above example calls sysfree to deallocate 1200
bytes of contiguous memory, beginning at the address
stored in the blkptr variable. If the call is successful, the
program continues; otherwise, if the return value in
gr121 is FALSE, the program jumps to free_err to
handle the error condition.
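
Since sysfree must be given the same address and byte count that a previous sysalloc request used, a library layered on these services has to remember each allocation. The Python sketch below illustrates that bookkeeping only; the class HeapLedger and its method names are hypothetical and do not correspond to any routine named in this specification.

```python
class HeapLedger:
    """Remember (address, size) pairs handed out by sysalloc."""

    def __init__(self):
        self._blocks = {}  # base address -> number of bytes

    def record_alloc(self, addr, nbytes):
        """Note a block returned by a successful sysalloc call."""
        self._blocks[addr] = nbytes

    def check_free(self, addr, nbytes):
        """Return True if (addr, nbytes) matches an outstanding block.

        A matching block is forgotten, so releasing it a second
        time is reported as a mismatch.
        """
        if self._blocks.get(addr) != nbytes:
            return False
        del self._blocks[addr]
        return True


if __name__ == "__main__":
    ledger = HeapLedger()
    ledger.record_alloc(0x1000, 1200)
    print(ledger.check_free(0x1000, 1200))
```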
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Service 259— getpsize 



Return Memory Page Size 



Description 

This service returns, in register gr96, the page size, in
bytes, used by the memory system of the HIF
implementation.

Register Usage 



Type      Regs    Contents      Description

Calling:  gr121   259 (0x103)   Service number

Returns:  gr96    pagesize      Success: memory page size, one of the
                                following: 1024, 2048, 4096, and 8192
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

pagsiz: .word   0

        const   gr121,259           ; service = 259
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,pag_err       ; jump if error
        const   gr120,pagsiz        ; set address to
        consth  gr120,pagsiz        ; store the page size
        store   0,0,gr96,gr120      ; store it!



The above example calls the operating system kernel to
return the page size used by the virtual memory system.
If the call was successful, gr121 will contain a boolean
TRUE result and the program will store the value in gr96
into the pagsiz variable; otherwise, a boolean FALSE is
returned in gr121. In this case, the program will jump to
pag_err to handle the error condition.
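
One common use of the page size is rounding a byte count up to a whole number of pages. The following Python sketch shows that calculation under stated assumptions: the function name round_up_to_page is hypothetical, and only the four page sizes listed for this service are accepted.

```python
VALID_PAGE_SIZES = (1024, 2048, 4096, 8192)  # sizes listed for getpsize


def round_up_to_page(nbytes, pagesize):
    """Round nbytes up to the next multiple of pagesize."""
    if pagesize not in VALID_PAGE_SIZES:
        raise ValueError("unexpected page size")
    return (nbytes + pagesize - 1) // pagesize * pagesize


if __name__ == "__main__":
    print(round_up_to_page(1200, 1024))
```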
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Service 260— getargs 

Return Base Address

Description

This service returns the base address of the
command-line-argument vector argv in register gr96, as
constructed by the operating system kernel when an
application program is invoked.

Arguments are stored by the operating system as a
series of NULL-terminated character strings. A pointer
containing the address of each string is stored in an
array whose base address (referred to as argv) is
returned by the getargs HIF service. The last entry in
the array contains a NULL pointer (an address
consisting of all zero bits). The number of arguments can be
computed by counting the number of pointers in the
array, using the fact that the NULL pointer terminates
the list.

Register Usage



Type      Regs    Contents      Description

Calling:  gr121   260 (0x104)   Service number

Returns:  gr96    baseaddr      Success: base address of argv
                                Failure: = 0 (NULL pointer)
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

argptr: .word   0

        const   gr121,260           ; service = 260
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,bas_err       ; jump if error
        const   gr120,argptr        ; set address where base
        consth  gr120,argptr        ; pointer is to be stored
        store   0,0,gr96,gr120      ; store the pointer



The above example calls operating system service 260
to access the command-line-argument vector address.
If the service executes without error, the program
continues by storing the argument vector address in the
variable argptr. If gr121 contains a boolean FALSE
value upon return, the program jumps to bas_err to
handle the error condition.
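
Counting the arguments from the NULL-terminated pointer array can be sketched as below. This Python fragment is illustrative only: the function name count_args is hypothetical, target memory is modeled as a dictionary from word address to 32-bit value, and pointers are assumed to occupy 4 bytes each.

```python
POINTER_SIZE = 4  # assumption: 32-bit pointers stored in consecutive words


def count_args(memory, argv):
    """Count argv entries up to, but not including, the NULL pointer.

    'memory' maps a word address to the 32-bit value stored there;
    'argv' is the base address returned in gr96 by getargs.
    """
    argc = 0
    while memory[argv + POINTER_SIZE * argc] != 0:
        argc += 1
    return argc


if __name__ == "__main__":
    # Three argument pointers followed by the terminating NULL.
    mem = {0x2000: 0x3000, 0x2004: 0x3010, 0x2008: 0x3020, 0x200C: 0}
    print(count_args(mem, 0x2000))
```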
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Service 273 — clock 



Return Time in Milliseconds 



Description 

This service returns the elapsed processor time in
milliseconds. Operating system initialization procedures set
this value to zero on startup. Successive calls to this
service return times that can be arithmetically
subtracted to accurately measure time intervals.

Register Usage



Type      Regs    Contents      Description

Calling:  gr121   273 (0x111)   Service number

Returns:  gr96    msecs         Success: ≠ 0 (time in milliseconds)
                                Failure: = 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

time:   .word   0

        const   gr121,273           ; service = 273
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,clk_err       ; jump if error
        const   gr120,time          ; set the address where
        consth  gr120,time          ; time is to be stored
        store   0,0,gr96,gr120      ; store the time in ms.



The above example calls the operating system kernel to
get the current value of the system clock in milliseconds.
On return, if gr121 contains a boolean FALSE value, the
program jumps to clk_err to handle the error; otherwise,
the time in milliseconds is stored in the variable time.
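
Measuring an interval amounts to subtracting two clock readings. The Python sketch below shows the subtraction; the function name elapsed_ms is hypothetical, and the masking assumes the counter is a 32-bit value that wraps to zero on overflow, which the service description does not state.

```python
def elapsed_ms(start, end):
    """Milliseconds between two clock readings.

    Assumption: readings are 32-bit values that wrap around, so the
    difference is taken modulo 2**32.
    """
    return (end - start) & 0xFFFFFFFF


if __name__ == "__main__":
    print(elapsed_ms(1000, 1250))
```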
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Service 274 — cycles 

Return Processor Cycles

Description

This service returns an ascending positive number in
registers gr96 and gr97 that is the number of processor
cycles that have elapsed since the last hardware
RESET was applied to the CPU. It provides a
mechanism for user programs to access the contents of the
internal Am29000 timer counter register. The cycle
count can be divided by the frequency of the processor
clock (equivalently, multiplied by the clock period) to
convert it to a time value. Gr97 will contain the
most significant bits of the cycle count, while gr96 will
contain the least significant bits. HIF implementations of
this service are required to provide a cycle count with a
minimum of 56 bits of precision.

Register Usage



Type      Regs    Contents      Description

Calling:  gr121   274 (0x112)   Service number

Returns:  gr96    cycles        Success: Bits 0-31 of processor cycles
                                Failure: = 0 (in both gr96 and gr97)
          gr97    cycles        Success: Bits 32-55 of processor cycles
                                Failure: = 0 (in both gr96 and gr97)
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

cycles: .word   0                   ; MSBs of cycles
        .word   0                   ; LSBs of cycles

        const   gr121,274           ; service = 274
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,cyc_err       ; jump if error
        const   gr120,cycles        ; set the address where the
        consth  gr120,cycles        ; count is to be stored
        store   0,0,gr97,gr120      ; store the MSBs,
        add     gr120,gr120,4       ; increment the address,
        store   0,0,gr96,gr120      ; then store the LSBs of cycles.



The above example program fragment calls the operating
system service 274 to access the number of CPU
cycles that have elapsed since it was powered on. The
cycle count (in gr96 and gr97) is stored in the two words
addressed by the variable cycles if the service call is
successful. If gr121 contains a boolean FALSE value on
exit, the program jumps to cyc_err to handle the error
condition.
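
Combining the two result registers into a single count, and converting that count to time, can be sketched as follows. This Python fragment is an illustration only: the function names are hypothetical, gr97 is assumed to carry bits 32-55 in its low 24 bits, and the clock frequency in hertz is a value the caller must supply.

```python
def cycle_count(gr96, gr97):
    """Combine gr97 (bits 32-55) and gr96 (bits 0-31) into one count."""
    return ((gr97 & 0xFFFFFF) << 32) | (gr96 & 0xFFFFFFFF)


def cycles_to_seconds(cycles, clock_hz):
    """Convert a cycle count to seconds for a given clock frequency."""
    return cycles / clock_hz


if __name__ == "__main__":
    total = cycle_count(gr96=25_000_000, gr97=0)
    # 25,000,000 cycles at an assumed 25 MHz clock is one second.
    print(cycles_to_seconds(total, 25_000_000))
```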
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Service 289— setvec 



Set User Trap Address 



Description 

This service sets the address for user-level trap handler
services that implement the local register stack spill and
fill traps. It returns an indication of success or failure in
register gr96. The method used to invoke these traps in
user mode is described on page 6 of this specification, in
the "User-Mode Traps" section.

Register Usage



Type      Regs    Contents      Description

Calling:  gr121   289 (0x121)   Service number
          lr2     trapno        Trap number
          lr3     funaddr       Address of user trap handler

Returns:  gr96    retval        Success: = 0
                                Failure: < 0
          gr121   0x80000000    Logical TRUE, service successful
                  errcode       Error number, service not successful
                                (implementation dependent)

Example Call

trpadr: .word   0

        const   lr2,64              ; trap number = 64
        const   lr3,t64_hnd         ; set address of
        consth  lr3,t64_hnd         ; trap-64 handler
        const   gr121,289           ; service = 289
        asneq   69,gr1,gr1          ; call the OS
        jmpf    gr121,vec_err       ; jump if error
        const   gr120,trpadr        ; set address where to
        consth  gr120,trpadr        ; store the trap address
        store   0,0,gr96,gr120      ; and store it!



The above example calls the setvec service to pass the
address to be used for the trap 64 trap handler routine. If
the service returns with gr121 containing a boolean
TRUE result, the program continues by storing the trap
address returned in gr96; otherwise, the program jumps
to vec_err to handle the error condition.
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PROCESS ENVIRONMENT 

There are standard memory and register initializations
that must be performed by a HIF-conforming kernel
before entry to a user program. In C-language
programs, this is usually performed by the module crt0.
This module receives control when an application
program is invoked, and executes prior to invocation of
the user's main function. Other high-level languages
have similar modules.

STARTUP INITIALIZATION 

Initialization procedures must establish appropriate 
values for the general registers mentioned below. In 
addition, file descriptors for the standard input and out- 
put devices must be opened. 

Register Stack Pointer (gr1) 

The register stack pointer (RSP) register contains the
main memory address in which the local register lr0 will
be saved, and from which it will be restored. The content
of RSP is compared to the content of RAB to determine
when it is necessary to spill part of the local register
stack to memory. On startup, the values in RAB, RSP
and RFB should be initialized to prevent a spill trap from
occurring on entry to the crt0 code, as shown by the
following relation:

(RAB + 256) ≤ RSP ≤ RFB

This provides the crt0 code with at least 64 registers
(256 bytes at 4 bytes per register) on entry, which
should be a sufficient number to accomplish its purpose.
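
A kernel's startup code could verify the register-stack relation before handing control to crt0. The Python sketch below illustrates the check; the function name startup_stack_ok is hypothetical, and it assumes the relation is read with non-strict inequalities, that is, RAB + 256 <= RSP <= RFB, with all three values being byte addresses.

```python
MIN_ENTRY_BYTES = 256  # 64 registers of 4 bytes each


def startup_stack_ok(rab, rsp, rfb):
    """Check the startup relation (RAB + 256) <= RSP <= RFB.

    Assumption: the comparison operators are non-strict.
    """
    return rab + MIN_ENTRY_BYTES <= rsp <= rfb


if __name__ == "__main__":
    print(startup_stack_ok(rab=0x8000, rsp=0x8100, rfb=0x8200))
```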

Memory Stack Pointer (gr125) 

The memory stack pointer (MSP) register points to the
top of the memory stack, or the lowest-addressed entry
on the memory stack. This register must be preserved
(or, more conventionally, restored).



Register Allocate Bound (gr126)

The register allocate bound (RAB) register contains the
register stack address of the lowest-addressed word
contained within the register file. RAB is referenced in
the prolog of most user program functions to determine
whether a register spill operation is necessary to
accommodate the local register requirements of the called
function.

Register Free Bound (gr127)

The register free bound (RFB) register contains the
register stack address of the lowest-addressed word not
contained within the register file (and greater than RAB).
RFB is referenced in the epilog of most user program
functions to determine whether a register fill operation is
necessary to restore previously spilled registers needed
by the function's caller.

Open File Descriptors 

File descriptor 0 (corresponding to the standard input
device) must be opened for text mode input. File
descriptors 1 and 2 (corresponding to standard output
and standard error devices) must be opened for text
mode output prior to entry to the user's program.

PROGRAM TERMINATION

The only valid way for an application to terminate execu- 
tion is by calling the exit service. Most high-level 
languages provide this capability, even if the program- 
mer does not explicitly invoke a corresponding library 
function. 

TRAP HANDLERS 

The trap vector entries shown in Table 5 must be 
installed, and corresponding handlers must be 
provided. 
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Table 5. Trap Handler Vectors 



Trap    Description

32      MULTIPLY
33      DIVIDE
34      MULTIPLU
35      DIVIDU
36      CONVERT
42      FEQ
43      DEQ
44      FGT
45      DGT
46      FGE
47      DGE
48      FADD
49      DADD
50      FSUB
51      DSUB
52      FMUL
53      DMUL
54      FDIV
55      DDIV
64      Spill (Set up by the user's task through a setvec call)
65      Fill (Set up by the user's task through a setvec call)
69      HIF System Call



Note: The Spill (64) and Fill (65) traps are returned to the user's code to perform the trap handling functions in user 
mode, as described in the "User Mode Traps" section. 
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APPENDIX A: HIF QUICK REFERENCE 

Table 6 lists the HIF service calls, calling parameters,
and the returned values. If a column entry is blank, it
means the register is not used or is undefined. Table 7
describes the parameters given in Table 6.

Table 6. HIF Service Calls 



Service     Calling Parameters                        Returned Values
Title       GR121   LR2         LR3         LR4       GR96          GR97          GR121

exit        1       exitcode
open        17      pathname    mode        pflag     fileno                      errcode
close       18      fileno                            retval                      errcode
read        19      fileno      buffptr     nbytes    count                       errcode
write       20      fileno      buffptr     nbytes    count                       errcode
lseek       21      fileno      offset      orig      where                       errcode
remove      22      pathname                          retval                      errcode
rename      23      oldfile     newfile               retval                      errcode
tmpnam      33      addrptr                           filename                    errcode
time        49                                        secs                        errcode
getenv      65      name                              addrptr                     errcode
sysalloc    257     nbytes                            addrptr                     errcode
sysfree     258     addrptr     nbytes                retval                      errcode
getpsize    259                                       pagesize                    errcode
getargs     260                                       baseaddr                    errcode
clock       273                                       msecs                       errcode
cycles      274                                       LSBs cycles   MSBs cycles   errcode
setvec      289     trapno      funaddr               retval                      errcode
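
The register layout in Table 6 can be expressed as data, which makes it easy to see what a caller loads before invoking a service. The following Python sketch is a host-side model under that reading of the table: the service number goes in GR121, parameters go in LR2 through LR4 in order, and results come back in GR96 and GR97. The names HIF_SERVICES and call_registers are hypothetical.

```python
# Host-side model of Table 6. Each entry holds:
#   (service number for GR121, parameters for LR2..LR4, values in GR96/GR97)
# Every service other than exit also returns errcode in GR121.
# HIF_SERVICES and call_registers are illustrative names only.
HIF_SERVICES = {
    "exit":     (1,   ("exitcode",), ()),
    "open":     (17,  ("pathname", "mode", "pflag"), ("fileno",)),
    "close":    (18,  ("fileno",), ("retval",)),
    "read":     (19,  ("fileno", "buffptr", "nbytes"), ("count",)),
    "write":    (20,  ("fileno", "buffptr", "nbytes"), ("count",)),
    "lseek":    (21,  ("fileno", "offset", "orig"), ("where",)),
    "remove":   (22,  ("pathname",), ("retval",)),
    "rename":   (23,  ("oldfile", "newfile"), ("retval",)),
    "tmpnam":   (33,  ("addrptr",), ("filename",)),
    "time":     (49,  (), ("secs",)),
    "getenv":   (65,  ("name",), ("addrptr",)),
    "sysalloc": (257, ("nbytes",), ("addrptr",)),
    "sysfree":  (258, ("addrptr", "nbytes"), ("retval",)),
    "getpsize": (259, (), ("pagesize",)),
    "getargs":  (260, (), ("baseaddr",)),
    "clock":    (273, (), ("msecs",)),
    "cycles":   (274, (), ("LSBs cycles", "MSBs cycles")),
    "setvec":   (289, ("trapno", "funaddr"), ("retval",)),
}


def call_registers(service):
    """Map each calling register to the value it carries for `service`."""
    number, params, _returns = HIF_SERVICES[service]
    registers = {"GR121": number}
    # Parameters fill LR2, LR3, LR4 in the order given in Table 6.
    for register, name in zip(("LR2", "LR3", "LR4"), params):
        registers[register] = name
    return registers
```

For example, call_registers("read") yields GR121 = 19, LR2 = fileno, LR3 = buffptr, and LR4 = nbytes, while call_registers("time") yields only GR121 = 49 because that service takes no parameters.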
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Table 7. Service Call Parameters 



Parameter   Description

addrptr     A pointer to an allocated memory area, a command-line-argument array, a pathname buffer, or a
            NULL-terminated environment variable name string.

baseaddr    The base address of the command-line-argument vector.

buffptr     A pointer to the buffer area where data is to be read from or written to during the execution of I/O
            services.

count       The number of bytes actually read from a file or written to a file.

cycles      The number of processor cycles (returned value).

errcode     The error code returned by the service. These are usually the same as the codes returned in the UNIX
            errno variable. See Appendix B, Table 8, for a list of HIF error codes.

exitcode    The exit code of the application program.

filename    A pointer to a NULL-terminated ASCII string that contains the directory path of a temporary filename.

fileno      The file descriptor, which is a small integer number. File descriptors 0, 1, and 2 are guaranteed to exist
            and correspond to open files on program entry (0 refers to the UNIX equivalent of stdin and is opened
            for input; 1 refers to the UNIX stdout and is opened for output; 2 refers to the UNIX stderr and is
            opened for output).

funaddr     A pointer to the address of a function.

mode        A series of option flags whose values represent the operation to be performed.

msecs       Milliseconds.

name        A pointer to a NULL-terminated ASCII string that contains an environment variable name.

nbytes      The number of data bytes requested to be read from or written to a file, or the number of bytes to
            allocate from the heap.

newfile     A pointer to a NULL-terminated ASCII string that contains the directory path of a new filename.

offset      The number of bytes from a specified position (orig) in a file.

oldfile     A pointer to a NULL-terminated ASCII string that contains the directory path of the old filename.

orig        A value of 0, 1, or 2 that refers to the beginning, the current position, or the position of the end of a file.

pagesize    The memory page size in bytes (returned value).

pathname    A pointer to a NULL-terminated ASCII string that contains the directory path of a filename.

pflag       The UNIX file access permission codes.

retval      The return value that indicates success or failure.

secs        The seconds count returned.

trapno      The trap number.

where       The current position in a specified file.
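
Table 6 shows the cycles service returning its count in two registers: the least significant bits in GR96 and the most significant bits in GR97. A caller that wants a single number has to join the two halves. The Python sketch below assumes that each register holds 32 bits and that the halves form one unsigned 64-bit count; neither assumption is stated in the tables above, and combine_cycles is a hypothetical helper name.

```python
REGISTER_BITS = 32                       # assumed register width
REGISTER_MASK = (1 << REGISTER_BITS) - 1


def combine_cycles(lsbs, msbs):
    """Join the GR96 (low) and GR97 (high) halves into one cycle count.

    Assumes 32-bit registers and an unsigned 64-bit result.
    """
    low = lsbs & REGISTER_MASK
    high = msbs & REGISTER_MASK
    return (high << REGISTER_BITS) | low
```

Under these assumptions, a low word of 0 with a high word of 1 represents 4294967296 cycles, that is, one full wrap of the low register.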
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APPENDIX B: ERROR NUMBERS 

HIF implementations are required to return error codes 
when a requested operation is not possible. The codes 
from 0 to 255 are reserved for compatibility with current 
and future error return standards. The currently 
assigned codes and their meanings are shown in 
Table 8. If a HIF implementation returns an error code in 
the range of 0 to 255, it must carry the identical meaning 
to the corresponding error code in this table. Error code 
values larger than 255 are available for 
implementation-specific errors. 
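
The three ranges described above can be summarized in a few lines of code. The Python sketch below is illustrative only: classify_errcode and its constants are hypothetical names, the upper bound of 70 for assigned codes is read from the last entry of Table 8, and rejecting negative values is an assumption rather than something the text states.

```python
RESERVED_MAX = 255   # codes 0 to 255 are reserved for standard meanings
LAST_ASSIGNED = 70   # highest number listed in Table 8


def classify_errcode(code):
    """Classify a HIF error code by the range it falls in."""
    if code < 0:
        # Assumption: the text does not define negative error codes.
        raise ValueError("negative error codes are not defined here")
    if 1 <= code <= LAST_ASSIGNED:
        return "assigned"
    if code <= RESERVED_MAX:
        # Covers 0 (not used) and the numbers not yet assigned.
        return "reserved"
    return "implementation-specific"
```

For instance, classify_errcode(13) returns "assigned", classify_errcode(200) returns "reserved", and classify_errcode(300) returns "implementation-specific".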



Table 8. HIF Error Numbers Assigned 



Number  Error Name        Description

0                         Not used.

1       EPERM             Not owner
                          This error indicates an attempt to modify a file in some way
                          forbidden except to its owner.

2       ENOENT            No such file or directory
                          This error occurs when a file name is specified and the file
                          should exist but does not, or when one of the directories in a
                          pathname does not exist.

3       ESRCH             No such process
                          The process or process group whose number was given does not
                          exist, or any such process is already dead.

4       EINTR             Interrupted system call
                          This error indicates that an asynchronous signal (such as
                          interrupt or quit) that the user has elected to catch occurred
                          during a system call.

5       EIO               I/O error
                          Some physical I/O error occurred during a read or write. This
                          error may in some cases occur on a call following the one to
                          which it actually applies.

6       ENXIO             No such device or address
                          I/O on a special file refers to a sub-device that does not exist
                          or is beyond the limits of the device.

7       E2BIG             Arg list is too long
                          An argument list longer than 5120 bytes is presented to execve.

8       ENOEXEC           Exec format error
                          A request is made to execute a file that, although it has the
                          appropriate permissions, does not start with a valid magic
                          number.

9       EBADF             Bad file number
                          Either a file descriptor refers to no open file, or a read
                          (write) request is made to a file that is open only for writing
                          (reading).

10      ECHILD            No children
                          Wait and the process has no living or unwaited-for children.

11      EAGAIN            No more processes
                          In a fork, the system's process table is full, or the user is
                          not allowed to create any more processes.

12      ENOMEM            Not enough memory
                          During an execve or break, a program asks for more memory than
                          the system is able to supply or else a process size limit would
                          be exceeded.
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Table 8. HiF Error Numbers Assigned (continued) 



Number  Error Name        Description

13      EACCESS           Permission denied
                          An attempt was made to access a file in a way forbidden by the
                          protection system.

14      EFAULT            Bad address
                          The system encountered a hardware fault in attempting to access
                          the arguments of a system call.

15      ENOTBLK           Block device required
                          A plain file was mentioned where a block device was required,
                          such as in mount.

16      EBUSY             Device busy
                          An attempt was made to mount a device that was already mounted,
                          or an attempt was made to dismount a device on which there is an
                          active file (open file, current directory, mounted-on file, or
                          active text segment).

17      EEXIST            File exists
                          An existing file was mentioned in an inappropriate context,
                          e.g., link.

18      EXDEV             Cross-device link
                          A hard link to a file on another device was attempted.

19      ENODEV            No such device
                          An attempt was made to apply an inappropriate system call to a
                          device, e.g., to read a write-only device, or the device is not
                          configured by the system.

20      ENOTDIR           Not a directory
                          A non-directory was specified where a directory is required,
                          for example, in a path name or as an argument to chdir.

21      EISDIR            Is a directory
                          An attempt was made to write on a directory.

22      EINVAL            Invalid argument
                          This error occurs when some invalid argument for the call is
                          specified. For example, dismounting a non-mounted device,
                          mentioning an unknown signal in signal, or specifying some other
                          argument that is inappropriate for the call.

23      ENFILE            File table overflow
                          The system's table of open files is full, and temporarily no
                          more open requests can be accepted.

24      EMFILE            Too many open files
                          The configuration limit on the number of simultaneously open
                          files has been exceeded.

25      ENOTTY            Not a typewriter
                          The file mentioned in stty or gtty is not a terminal or one of
                          the other devices to which these calls apply.

26      ETXTBSY           Text file busy
                          The referenced text file is busy and the current request cannot
                          be honored.

27      EFBIG             File too large
                          The size of a file exceeded the maximum limit.
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Table 8. HIF Error Numbers Assigned (continued) 



Number  Error Name        Description

28      ENOSPC            No space left on device
                          A write to an ordinary file, the creation of a directory or
                          symbolic link, or the creation of a directory entry failed
                          because no more disk blocks are available on the file system.

29      ESPIPE            Illegal seek
                          A seek was issued to a socket or pipe. This error may also be
                          issued for other non-seekable devices.

30      EROFS             Read-only file system
                          An attempt to modify a file or directory was made on a device
                          mounted read-only.

31      EMLINK            Too many links
                          An attempt was made to establish a new link to the requested
                          file and the limit of simultaneous links has been exceeded.

32      EPIPE             Broken pipe
                          A write on a pipe or socket was attempted for which there is no
                          process to read the data. This condition normally generates a
                          signal; the error is returned if the signal is caught or
                          ignored.

33      EDOM              Argument too large
                          The argument of a function in the math package is out of the
                          domain of the function.

34      ERANGE            Result too large
                          The value of a function in the math package is unrepresentable
                          within machine precision.

35      EWOULDBLOCK       Operation would block
                          An operation that would cause a process to block was attempted
                          on an object in non-blocking mode.

36      EINPROGRESS       Operation now in progress
                          An operation that takes a long time to complete was attempted
                          on a non-blocking object.

37      EALREADY          Operation already in progress
                          An operation was attempted on a non-blocking object that
                          already had an operation in progress.

38      ENOTSOCK          Socket operation on non-socket
                          A socket-oriented operation was attempted on a non-socket
                          device.

39      EDESTADDRREQ      Destination address required
                          A required address was omitted from an operation on a socket.

40      EMSGSIZE          Message too long
                          A message sent on a socket was larger than the internal message
                          buffer or some other network limit.

41      EPROTOTYPE        Protocol wrong type for socket
                          A protocol was specified that does not support the semantics of
                          the socket type requested.
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Table 8. HIF Error Numbers Assigned (continued) 



Number  Error Name        Description

42      ENOPROTOOPT       Option not supported by protocol
                          A bad option or level was specified when accessing socket
                          options.

43      EPROTONOSUPPORT   Protocol not supported
                          The protocol has not been configured into the system, or no
                          implementation for it exists.

44      ESOCKTNOSUPPORT   Socket type not supported
                          The support for the socket type has not been configured into
                          the system, or no implementation for it exists.

45      EOPNOTSUPP        Operation not supported on socket
                          For example, trying to accept a connection on a datagram
                          socket.

46      EPFNOSUPPORT      Protocol family not supported
                          The protocol family has not been configured into the system or
                          no implementation for it exists.

47      EAFNOSUPPORT      Address family not supported by protocol family
                          An address was used that is incompatible with the requested
                          protocol.

48      EADDRINUSE        Address already in use
                          Only one usage of each address is normally permitted.

49      EADDRNOTAVAIL     Cannot assign requested address
                          This normally results from an attempt to create a socket with
                          an address not on this machine.

50      ENETDOWN          Network is down
                          A socket operation encountered a dead network.

51      ENETUNREACH       Network is unreachable
                          A socket operation was attempted to an unreachable network.

52      ENETRESET         Network dropped connection on reset
                          The host you were connected to crashed and rebooted.

53      ECONNABORTED      Software caused connection abort
                          A connection abort was caused internal to your host machine.

54      ECONNRESET        Connection reset by peer
                          A connection was forcibly closed by a peer. This normally
                          results from a loss of the connection on the remote socket due
                          to a timeout or a reboot.

55      ENOBUFS           No buffer space available
                          An operation on a socket or pipe was not performed because the
                          system lacked sufficient buffer space or because a queue was
                          full.

56      EISCONN           Socket is already connected
                          A connect request was made on an already connected socket; or a
                          sendto or sendmsg request on a connected socket specified a
                          destination when already connected.
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Table 8. HiF Error Numbers Assigned (continued) 



Number  Error Name        Description

57      ENOTCONN          Socket is not connected
                          A request to send or receive data was disallowed because the
                          socket was not connected and (when sending on a datagram
                          socket) no address was supplied.

58      ESHUTDOWN         Cannot send after socket shutdown
                          A request to send data was disallowed because the socket had
                          already been shut down with a previous shutdown call.

59      ETOOMANYREFS      Too many references; cannot splice.

60      ETIMEDOUT         Connection timed out
                          A connect or send request failed because the connected party
                          did not properly respond after a period of time. (The timeout
                          period is dependent on the communication protocol.)

61      ECONNREFUSED      Connection refused
                          No connection could be made because the target machine actively
                          refused it. This usually results from trying to connect to a
                          service that is inactive on the foreign host.

62      ELOOP             Too many levels of symbolic links
                          A pathname lookup involved more than the maximum limit of
                          symbolic links.

63      ENAMETOOLONG      File name too long
                          A component of a pathname exceeded the maximum name length, or
                          an entire pathname exceeded the maximum path length.

64      EHOSTDOWN         Host is down
                          A socket operation failed because the destination host was
                          down.

65      EHOSTUNREACH      Host is unreachable
                          A socket operation was attempted to an unreachable host.

66      ENOTEMPTY         Directory not empty
                          A non-empty directory was supplied to a remove directory or
                          rename call.

67      EPROCLIM          Too many processes
                          The limit of the total number of processes has been reached.
                          No new processes can be created.

68      EUSERS            Too many users
                          The limit of the total number of users has been reached. No
                          new users may access the system.

69      EDQUOT            Disk quota exceeded
                          A write to an ordinary file, the creation of a directory or
                          symbolic link, or the creation of a directory entry failed
                          because the user's quota of disk blocks was exhausted; or the
                          allocation of an inode for a newly created file failed because
                          the user's quota of inodes was exhausted.

70      EVDBAD            RVD related disk error
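
When an errcode value comes back from a service, a host-side tool may want to print the symbolic name from Table 8. The Python sketch below is one way to do that; HIF_ERRORS and errname are hypothetical names, and the spellings follow the table as printed rather than any particular UNIX header file.

```python
# Error names from Table 8 in numeric order, starting at 1.
# Spellings follow the table as printed.
_NAMES = """
EPERM ENOENT ESRCH EINTR EIO ENXIO E2BIG ENOEXEC EBADF ECHILD
EAGAIN ENOMEM EACCESS EFAULT ENOTBLK EBUSY EEXIST EXDEV ENODEV ENOTDIR
EISDIR EINVAL ENFILE EMFILE ENOTTY ETXTBSY EFBIG ENOSPC ESPIPE EROFS
EMLINK EPIPE EDOM ERANGE EWOULDBLOCK EINPROGRESS EALREADY ENOTSOCK
EDESTADDRREQ EMSGSIZE EPROTOTYPE ENOPROTOOPT EPROTONOSUPPORT
ESOCKTNOSUPPORT EOPNOTSUPP EPFNOSUPPORT EAFNOSUPPORT EADDRINUSE
EADDRNOTAVAIL ENETDOWN ENETUNREACH ENETRESET ECONNABORTED ECONNRESET
ENOBUFS EISCONN ENOTCONN ESHUTDOWN ETOOMANYREFS ETIMEDOUT ECONNREFUSED
ELOOP ENAMETOOLONG EHOSTDOWN EHOSTUNREACH ENOTEMPTY EPROCLIM EUSERS
EDQUOT EVDBAD
""".split()

# Map each assigned number (1 through 70) to its name.
HIF_ERRORS = {number: name for number, name in enumerate(_NAMES, start=1)}


def errname(code):
    """Return the Table 8 name for `code`, or None if no name is assigned."""
    return HIF_ERRORS.get(code)
```

Here errname(2) returns "ENOENT", and errname(0) returns None because error number 0 is not used.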
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Additional Support Literature 

The following is a list of AMD 29K Family literature that can be ordered from your local AMD Sales Representative 
or the Literature Distribution Center at (800) 222-9323, extension 5000; inside California, call (408) 749-5000. 
Technical and marketing information concerning the 29K Family also can be obtained by calling the 29K Hotline at 
(800) 2929-AMD. 

Order No.   Title

09548       Am29000 Article Reprint Brochure
10344       Am29000 Family Overview Brochure
10345       29K Support Products Brochure
10620       Am29000 User's Manual
10621       Am29000 Performance Analysis Brochure
10623       Am29000 Memory Design Handbook
11426       Fusion 29K Catalog
11852       Am29027 Handbook
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CGX169 




PIO tt 07322B 






*For reference only. All dimensions are measured in inches. BSC is an ANSI standard for Basic Space Centering. 
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Sales Offices 



North American 

ALABAMA (205) 882-9122 

ARIZONA (602) 242-4400 

CALIFORNIA, 

Culver City (213) 645-1524 

Newport Beach (714) 752-6262 

Roseville (916) 786-6700 

San Diego (619) 560-7030 

San Jose (408) 452-0500 

Woodland Hills (818) 992-4155 

CANADA, Ontario, 

Kanata (613) 592-0060 

Willowdale (416) 224-5193 

COLORADO (303) 741-2900 

CONNECTICUT (203) 264-7800 

FLORIDA, 

Clearwater (813) 530-9971 

Ft. Lauderdale (305) 776-2001 

Orlando (Casselberry) (407) 830-8100 

GEORGIA (404) 449-7920 

ILLINOIS, 

Chicago (Itasca) (312) 773-4422 

Naperville (312) 505-9517 

KANSAS (913) 451-3115 

MARYLAND (301) 796-9310 

MASSACHUSETTS (617) 273-3970 

MICHIGAN (313) 347-1522 

MINNESOTA (612) 938-0001 

NEW JERSEY, 

Cherry Hill (609) 662-2900 

Parsippany (201) 299-0002 

NEW YORK, 

Liverpool (315) 457-5400 

Poughkeepsie (914) 471-8180 

Rochester (716) 272-9020 

NORTH CAROLINA (919) 878-8111 

OHIO, 

Columbus (Westerville) (614) 891-6455 

Dayton (513) 439-0470 

OREGON (503) 245-0080 

PENNSYLVANIA (215) 398-8006 

SOUTH CAROLINA (803) 772-6760 

TEXAS, 

Austin (512) 346-7830 

Dallas (214) 934-9099 

Houston (713) 785-9001 

International 



BELGIUM, Bruxelles          TEL (02) 771-91-42
                            FAX (02) 762-37-12
                            TLX 846-61028

FRANCE, Paris               TEL (1) 49-75-10-10
                            FAX (1) 49-75-10-13
                            TLX 263282F

WEST GERMANY,
Hannover area               TEL (0511) 736085
                            FAX (0511) 721254
                            TLX 922850
Munchen                     TEL (089) 4114-0
                            FAX (089) 406490
                            TLX 523883
Stuttgart                   TEL (0711) 62 33 77
                            FAX (0711) 625187
                            TLX 721882

HONG KONG, Wanchai          TEL 852-5-8654525
                            FAX 852-5-8654335
                            TLX 67955AMDAPHX

ITALY, Milan                TEL (02) 3390541
                                (02) 3533241
                            FAX (02) 3498000
                            TLX 843-315286

JAPAN,
Kanagawa                    TEL 462-47-2911
                            FAX 462-47-1729
Tokyo                       TEL (03) 345-8241
                            FAX (03) 342-5196
                            TLX J24064AMDTKOJ
Osaka                       TEL 06-243-3250
                            FAX 06-243-3253

KOREA, Seoul                TEL 822-784-0030
                            FAX 822-784-8014

LATIN AMERICA,
Ft. Lauderdale              TEL (305) 484-8600
                            FAX (305) 485-9736
                            TLX 5109554261 AMDFTL

NORWAY, Hovik               TEL (03) 010156
                            FAX (02) 591959
                            TLX 79079HBCN

SINGAPORE                   TEL 65-3481188
                            FAX 65-3480161
                            TLX 55650 AMDMMI

SWEDEN,
Stockholm (Sundbyberg)      TEL (08) 733 03 50
                            FAX (08) 733 22 85
                            TLX 11602

TAIWAN                      TEL 886-2-7213393
                            FAX 886-2-7723422
                            TLX 886-2-7122066

UNITED KINGDOM,
Manchester area             TEL (0925) 828008
(Warrington)                FAX (0925) 827693
                            TLX 851-628524
London area                 TEL (0483) 740440
(Woking)                    FAX (0483) 756196
                            TLX 851-859103

North American Representatives 

CANADA 
Burnaby, B.C. 

DAVETEK MARKETING (604) 430-3680 

Calgary, Alberta 

DAVETEK MARKETING (403) 291-4984 

Kanata, Ontario 

VITEL ELECTRONICS (613) 592-0060 

Mississauga, Ontario 

VITEL ELECTRONICS (416) 676-9720 

Lachine, Quebec 

VITEL ELECTRONICS (514) 636-5951 

IDAHO 

INTERMOUNTAIN TECH MKTG, INC (208) 888-6071 

ILLINOIS 

HEARTLAND TECH MKTG, INC (312) 577-9222 

INDIANA 

Huntington - ELECTRONIC MARKETING 

CONSULTANTS, INC (317) 921-3450 

Indianapolis - ELECTRONIC MARKETING 

CONSULTANTS, INC (317) 921-3450 

IOWA 

LORENZ SALES (319) 377-4666 

KANSAS 

Merriam - LORENZ SALES (913) 384-6556 

Wichita - LORENZ SALES (316) 721-0500 

KENTUCKY 

ELECTRONIC MARKETING 

CONSULTANTS, INC (317) 921-3452 

MICHIGAN 

Birmingham - MIKE RAICK ASSOCIATES (313) 644-5040 

Holland - COM-TEK SALES, INC (616) 399-7273 

Novi - COM-TEK SALES, INC (313) 344-1409 

MISSOURI 

LORENZ SALES (314) 997-4558 

NEBRASKA 

LORENZ SALES (402) 475-4660 

NEW MEXICO 

THORSON DESERT STATES (505) 293-8555 

NEW YORK 

East Syracuse - NYCOM, INC (315) 437-8343 

Woodbury - COMPONENT 

CONSULTANTS, INC (516) 364-8020 

OHIO 

Centerville - DOLFUSS ROOT & CO (513) 433-6776 

Columbus - DOLFUSS ROOT & CO (614) 885-4844 

Strongsville - DOLFUSS ROOT & CO (216) 238-0300 

PENNSYLVANIA 

DOLFUSS ROOT & CO (412) 221-4420 

PUERTO RICO 

COMP REP ASSOC, INC (809) 746-6550 

UTAH, R2 MARKETING (801) 595-0631 

WASHINGTON 

ELECTRA TECHNICAL SALES (206) 821-7442 

WISCONSIN 

HEARTLAND TECH MKTG, INC (414) 792-0920 



Advanced Micro Devices reserves the right to make changes in its product without notice in order to improve design or performance characteristics. The performance 
characteristics listed in this document are guaranteed by specific tests, guard banding, design and other practices common to the industry. For specific testing details, 
contact your local AMD sales representative. The company assumes no responsibility for the use of any circuits described herein. 
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